A NOVEL HAEMOPOIETIN RECEPTOR AND GENETIC
SEQUENCES ENCODING SAME

The present invention relates generally to a novel 
haemopoietin receptor or derivatives thereof and to 
genetic sequences encoding same. Interaction between 
the novel receptor of the present invention and a ligand 
facilitates proliferation, differentiation and survival 
of a wide variety of cells. The novel receptor and its 
derivatives and the genetic sequences encoding same of 
the present invention are useful in the development of a 
wide range of agonists, antagonists, therapeutics and 
diagnostic reagents based on ligand interaction with its
receptor.

Bibliographic details of the publications numerically 
referred to in this specification are collected at the 
end of the description. Sequence Identity Numbers (SEQ 
ID NOs.) for the nucleotide and amino acid sequences 
referred to in the specification are defined following 
the bibliography. 



Throughout this specification and the claims which
follow, unless the context requires otherwise, the word
"comprise", or variations such as "comprises" or
"comprising", will be understood to imply the inclusion
of a stated integer or group of integers but not the
exclusion of any other integer or group of integers.



The rapidly increasing sophistication of recombinant DNA
techniques is greatly facilitating research into the
medical and allied health fields. Cytokine research is
of particular importance, especially as these molecules
regulate the proliferation, differentiation and function
of a wide variety of cells. The administration of
recombinant cytokines, or the regulation of cytokine
function and/or synthesis, is increasingly becoming the
focus of medical research into the treatment of a range
of disease conditions.

Despite the discovery of a range of cytokines and other
secreted regulators of cell function, comparatively few
cytokines are directly used or targeted in therapeutic
regimens. One reason for this is the pleiotropic nature
of many cytokines. For example, interleukin (IL)-11 is
a functionally pleiotropic molecule (1,2), initially
characterized by its ability to stimulate proliferation
of the IL-6-dependent plasmacytoma cell line, T1165
(3). Other biological actions of IL-11 include
induction of multipotential haemopoietic progenitor cell
proliferation (4,5,6), enhancement of megakaryocyte and
platelet formation (7,8,9,10), stimulation of acute
phase protein synthesis (11) and inhibition of adipocyte
lipoprotein lipase activity (12,13).

Other important cytokines in the IL-11 group include
IL-6, leukaemia inhibitory factor (LIF), oncostatin M
(OSM) and ciliary neurotrophic factor (CNTF). All these
cytokines exhibit pleiotropic properties with
significant activities in proliferation,
differentiation and survival of cells. Members of the
haemopoietin receptor family are defined by the presence
of a conserved amino acid domain in their extracellular
region. However, despite the low level of amino acid
sequence conservation between other haemopoietin
receptor domains of different receptors, they are all
predicted to assume a similar tertiary structure,
centred around two fibronectin-type III repeats (18,19).

The size of the haemopoietin receptor family has now
become extensive and includes the cell surface receptors
for many cytokines including interleukin-2 (IL-2), IL-3,
IL-4, IL-5, IL-6, IL-7, IL-9, IL-11, IL-12, IL-13,
IL-15, granulocyte colony stimulating factor (G-CSF),
granulocyte-macrophage-CSF (GM-CSF), erythropoietin,
thrombopoietin, leptin, leukaemia inhibitory factor,
oncostatin-M, ciliary neurotrophic factor,
cardiotrophin, growth hormone and prolactin. Although
most of the members of the haemopoietin receptor family
act as classic cell surface receptors, binding their
cognate ligand at the cell surface and initiating
intracellular signal transduction, some receptors are
also produced in naturally occurring soluble forms.
These soluble receptors can act either as cytokine
antagonists, by binding to cytokines and inhibiting
productive interactions with cell surface receptors
(e.g. LIF binding protein; 20), or as agonists, binding
to cytokine and potentiating interaction with cell
surface receptor components (e.g. soluble interleukin-6
receptor α-chain; 21). Still other members of the family
appear to be produced only as secreted proteins, with no
evidence of a cell surface form. In this regard, the
IL-12 p40 subunit is a useful example. The cytokine
IL-12 is secreted as a heterodimer composed of a p35
subunit which shows similarity to cytokines such as IL-6
(22) and a p40 subunit which shares similarity with the
IL-6 receptor α-chain (23). In this case the soluble
receptor acts as part of the cytokine itself and is
essential to formation of an active protein. In
addition to acting as cytokines (e.g. IL-12p40), cytokine
agonists (e.g. IL-6 receptor α-chain) or cytokine
antagonists (e.g. LIF binding protein), members of the
haemopoietin receptor family have been useful in the
discovery of small molecule cytokine mimetics. For
example, the discovery of peptide mimetics of two
commercially valuable cytokines, erythropoietin and
thrombopoietin, centred on the selection of peptides
capable of binding to soluble versions of the
erythropoietin and thrombopoietin receptors (24,25). Due
to the importance and multifactorial nature of these
cytokines, there is a need to identify receptors, both
cell bound and soluble, for pleiotropic cytokines.
Identification of such receptors permits the
identification of pleiotropic cytokines and the
development of a range of therapeutic and diagnostic
agents.

Accordingly, one aspect of the present invention relates
to a nucleic acid molecule comprising a sequence of
nucleotides encoding or complementary to a sequence
encoding a novel haemopoietin receptor or a derivative
thereof.

More particularly, the present invention provides a
nucleic acid molecule comprising a sequence of
nucleotides encoding or complementary to a sequence
encoding a novel haemopoietin receptor or a derivative
thereof having the motif:

Trp Ser Xaa Trp Ser [SEQ ID NO:1],

wherein Xaa is any amino acid and is preferably Asp or
Glu.
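
By way of illustration only, the motif of SEQ ID NO:1 can be written in one-letter code as W-S-x-W-S and located in a candidate protein sequence with a short script. The following is a minimal sketch, not part of the described method; the function name `find_wsxws` and the example sequence are hypothetical and are not taken from any SEQ ID NO.

```python
import re

# One-letter form of Trp Ser Xaa Trp Ser. The lookahead lets overlapping
# occurrences (e.g. WSDWSEWS) be reported separately.
WSXWS = re.compile(r"(?=WS([A-Z])WS)")


def find_wsxws(protein):
    """Return (0-based start, Xaa, is_preferred) for every motif occurrence.

    is_preferred is True when Xaa is Asp (D) or Glu (E), the residues the
    text names as preferred.
    """
    hits = []
    for match in WSXWS.finditer(protein.upper()):
        xaa = match.group(1)
        hits.append((match.start(), xaa, xaa in "DE"))
    return hits


if __name__ == "__main__":
    # Hypothetical test sequence (assumption for illustration only).
    example = "MKTAYWSDWSPLGAWSKWSQ"
    for start, xaa, preferred in find_wsxws(example):
        print(start, xaa, "preferred" if preferred else "other")
```

The lookahead form is chosen so that two motifs sharing a Trp-Ser pair are both reported; a plain pattern would skip the second.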

Even more particularly, the present invention is
directed to a nucleic acid molecule comprising a
sequence of nucleotides encoding or complementary to a
sequence encoding a novel haemopoietin receptor or a
derivative thereof, said receptor comprising the motif:

Trp Ser Xaa Trp Ser [SEQ ID NO:1]

wherein Xaa is any amino acid and is preferably Asp or
Glu, said nucleic acid molecule being identifiable by
hybridisation to said molecule under low stringency
conditions at 42°C with

5' (A/G)CTCCA(A/G)TC(A/G)CTCCA 3' [SEQ ID NO:7]
and
5' (A/G)CTCCA(C/T)TC(A/G)CTCCA 3' [SEQ ID NO:8].
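
The relationship between the two degenerate oligonucleotides and the motif can be checked by enumeration. The sketch below is illustrative only and assumes the standard genetic code. It expands each degenerate oligonucleotide, takes the reverse complement and translates it, which shows that the oligonucleotide of SEQ ID NO:7 is complementary to sequences encoding Trp Ser Asp Trp Ser and that of SEQ ID NO:8 to sequences encoding Trp Ser Glu Trp Ser. All function names are hypothetical.

```python
import re
from itertools import product

# Standard genetic code, built in TCAG order (assumption: standard table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def parse_degenerate(oligo):
    """Turn '(A/G)CT...' into a list of allowed bases per position."""
    positions = []
    for group, single in re.findall(r"\(([ACGT/]+)\)|([ACGT])", oligo):
        positions.append(group.replace("/", "") if group else single)
    return positions


def expand(oligo):
    """Enumerate every non-degenerate sequence covered by the oligo."""
    return ["".join(p) for p in product(*parse_degenerate(oligo))]


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def encoded_peptides(oligo):
    """Peptides encoded by the strand complementary to the oligo."""
    return {translate(reverse_complement(s)) for s in expand(oligo)}


SEQ_ID_NO_7 = "(A/G)CTCCA(A/G)TC(A/G)CTCCA"
SEQ_ID_NO_8 = "(A/G)CTCCA(C/T)TC(A/G)CTCCA"

if __name__ == "__main__":
    print(len(expand(SEQ_ID_NO_7)), encoded_peptides(SEQ_ID_NO_7))
    print(len(expand(SEQ_ID_NO_8)), encoded_peptides(SEQ_ID_NO_8))
```

Each oligonucleotide has three two-fold degenerate positions and therefore covers eight sequences; the degeneracy absorbs the third-base variation of the Ser (AGC/AGT), Asp (GAC/GAT) and Glu (GAA/GAG) codons.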

Still more particularly, the present invention provides
an isolated nucleic acid molecule comprising a sequence
of nucleotides substantially as set forth in SEQ ID
NO:12 or a nucleotide sequence having at least 60%
similarity to the nucleotide sequence set forth in SEQ
ID NO:12 or a nucleotide sequence capable of hybridising
thereto under low stringency conditions at 42°C and
wherein said nucleotide sequence encodes a novel
haemopoietin receptor or a derivative thereof.

In a related embodiment, the present invention provides
an isolated nucleic acid molecule comprising a sequence
of nucleotides substantially as set forth in SEQ ID
NO:14 or a nucleotide sequence having at least 60%
similarity to the nucleotide sequence set forth in SEQ
ID NO:14 or a nucleotide sequence capable of hybridising
thereto under low stringency conditions at 42°C and
wherein said nucleotide sequence encodes a novel
haemopoietin receptor or a derivative thereof.

In another related embodiment, the present invention
provides an isolated nucleic acid molecule comprising a
sequence of nucleotides substantially as set forth in
SEQ ID NO:16 or a nucleotide sequence having at least
60% similarity to the nucleotide sequence set forth in
SEQ ID NO:16 or a nucleotide sequence capable of
hybridising thereto under low stringency conditions at
42°C and wherein said nucleotide sequence encodes a
novel haemopoietin receptor or a derivative thereof.

In a further related embodiment, the present invention
provides an isolated nucleic acid molecule comprising a
sequence of nucleotides substantially as set forth in
SEQ ID NO:18 or a nucleotide sequence having at least
60% similarity to the nucleotide sequence set forth in
SEQ ID NO:18 or a nucleotide sequence capable of
hybridising thereto under low stringency conditions at
42°C and wherein said nucleotide sequence encodes a
novel haemopoietin receptor or a derivative thereof.

In yet a further related embodiment, the present
invention provides an isolated nucleic acid molecule
comprising a sequence of nucleotides substantially as
set forth in SEQ ID NO:24 or a nucleotide sequence
having at least 60% similarity to the nucleotide
sequence set forth in SEQ ID NO:24 or a nucleotide
sequence capable of hybridising thereto under low
stringency conditions at 42°C and wherein said
nucleotide sequence encodes a novel haemopoietin
receptor or a derivative thereof.

Still yet a further embodiment of the present invention
is directed to a sequence of nucleotides substantially
as set forth in SEQ ID NO:28 or a nucleotide sequence
having at least 60% similarity to the nucleotide
sequence set forth in SEQ ID NO:28 or a nucleotide
sequence capable of hybridising thereto under low
stringency conditions at 42°C and wherein said
nucleotide sequence encodes a novel haemopoietin
receptor or a derivative thereof.

In still yet another embodiment, the present invention
provides an isolated nucleic acid molecule comprising a
sequence of nucleotides substantially as set forth in
SEQ ID NO:38 or a nucleotide sequence having at least 60%
similarity to the nucleotide sequence set forth in SEQ
ID NO:38 or a nucleotide sequence capable of hybridising
thereto under low stringency conditions at 42°C and
wherein said nucleotide sequence encodes a novel
haemopoietin receptor or a derivative thereof.

The term "receptor" is used in its broadest sense and
includes any molecule capable of binding, associating or
otherwise interacting with a ligand. Generally, the
interaction will have a signalling effect although the
present invention is not necessarily so limited. For
example, the "receptor" may be in soluble form, often
referred to as a cytokine binding protein. A receptor
may be deemed a receptor notwithstanding that its ligand
or ligands has or have not been identified.

Preferably, the novel receptor is derived from a mammal
or a species of bird. Particularly preferred mammals
include humans, primates, laboratory test animals (e.g.
mice, rats, rabbits, guinea pigs), livestock animals
(e.g. sheep, horses, pigs, cows), companion animals
(e.g. dogs, cats) or captive wild animals (e.g. deer,
foxes, kangaroos). Although the present invention is
exemplified with respect to mice, the scope of the
subject invention extends to all animals and in
particular humans.

The present invention is predicated in part on an
ability to identify members of the haemopoietin receptor
family with limited sequence similarity. Based on this
approach, a genetic sequence has been identified in
accordance with the present invention which encodes a
novel receptor. The expressed genetic sequence is
referred to herein as "NR6". Different forms of NR6 are
referred to as, for example, NR6.1, NR6.2 and NR6.3.
The nucleotide and corresponding amino acid sequences
for these molecules are represented in SEQ ID NOs:12, 14
and 16, respectively.

Preferred human and murine nucleic acid sequences for
NR6 or its derivatives include sequences from brain,
liver, kidney, neonatal, embryonic, cancer or
tumour-derived tissues.

Reference herein to a low stringency at 42°C includes
and encompasses from at least about 1% v/v to at least
about 15% v/v formamide and from at least about 1M to at
least about 2M salt for hybridisation, and at least
about 1M to at least about 2M salt for washing
conditions. Alternative stringency conditions may be
applied where necessary, such as medium stringency,
which includes and encompasses from at least about 16%
v/v to at least about 30% v/v formamide and from at
least about 0.5M to at least about 0.9M salt for
hybridisation, and at least about 0.5M to at least about
0.9M salt for washing conditions, or high stringency,
which includes and encompasses from at least about 31%
v/v to at least about 50% v/v formamide and from at
least about 0.01M to at least about 0.15M salt for
hybridisation, and at least about 0.01M to at least
about 0.15M salt for washing conditions.
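
The three stringency bands defined above can be captured as a small lookup. The following is a sketch only: it treats the "at least about" bounds as closed numeric ranges, which is a simplifying assumption, and the names `STRINGENCY` and `classify_stringency` are hypothetical.

```python
# Hybridisation conditions from the definitions above, read as closed ranges
# (assumption: "at least about X to at least about Y" is treated as [X, Y]).
STRINGENCY = {
    "low": {"formamide_pct": (1.0, 15.0), "salt_molar": (1.0, 2.0)},
    "medium": {"formamide_pct": (16.0, 30.0), "salt_molar": (0.5, 0.9)},
    "high": {"formamide_pct": (31.0, 50.0), "salt_molar": (0.01, 0.15)},
}


def classify_stringency(formamide_pct, salt_molar):
    """Return 'low', 'medium' or 'high', or None if no band matches both values."""
    for name, band in STRINGENCY.items():
        f_lo, f_hi = band["formamide_pct"]
        s_lo, s_hi = band["salt_molar"]
        if f_lo <= formamide_pct <= f_hi and s_lo <= salt_molar <= s_hi:
            return name
    return None


if __name__ == "__main__":
    print(classify_stringency(10, 1.5))   # formamide 10% v/v, 1.5M salt
    print(classify_stringency(40, 0.1))   # formamide 40% v/v, 0.1M salt
```

Combinations that fall between bands, or that pair the formamide range of one band with the salt range of another, return `None` rather than being forced into a class.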

The nucleic acid molecules contemplated by the present
invention are generally in isolated form and are
preferably cDNA or genomic DNA molecules. In a
particularly preferred embodiment, the nucleic acid
molecules are in vectors and most preferably expression
vectors to enable expression in a suitable host cell.
Particularly useful host cells include prokaryotic
cells, mammalian cells, yeast cells and insect cells.
The cells may also be in the form of a cell line.

Accordingly, another aspect of the present invention
provides an expression vector comprising a nucleic acid
molecule encoding the novel haemopoietin receptor or a
derivative thereof as hereinbefore described, said
expression vector capable of expression in a selected
host cell.

Another aspect of the present invention contemplates a
method for cloning a nucleotide sequence encoding NR6 or
a derivative thereof, said method comprising searching a
nucleotide database for a sequence which encodes the
amino acid sequence set forth in SEQ ID NO:1, designing
one or more oligonucleotide primers based on the
nucleotide sequence located in the search, screening a
nucleic acid library with said one or more
oligonucleotides and obtaining a clone therefrom which
encodes said NR6 or part thereof.
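
The database-searching step of this method can be pictured as a six-frame translation followed by a motif scan. The sketch below is illustrative only and is not the search procedure actually used. It assumes the standard genetic code and operates on a single in-memory nucleotide string rather than a real database; the function names and the example sequence are hypothetical.

```python
import re

# Standard genetic code, built in TCAG order (assumption: standard table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")
MOTIF = re.compile(r"(?=WS([A-Z])WS)")


def translate(seq):
    """Translate from position 0; unknown codons (e.g. containing N) give 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_motif_hits(dna):
    """Return (strand, frame, peptide_index, Xaa) for each WSXWS occurrence."""
    dna = dna.upper()
    strands = {"+": dna, "-": dna.translate(COMPLEMENT)[::-1]}
    hits = []
    for strand, seq in strands.items():
        for frame in range(3):
            protein = translate(seq[frame:])
            for match in MOTIF.finditer(protein):
                hits.append((strand, frame, match.start(), match.group(1)))
    return hits


if __name__ == "__main__":
    # Hypothetical entry: two flanking bases either side of TGG AGC GAT TGG AGT.
    entry = "AT" + "TGGAGCGATTGGAGT" + "CC"
    print(six_frame_motif_hits(entry))
```

A hit reports the strand and reading frame, which indicates where in the located nucleotide sequence primers might be designed.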

Once a novel nucleotide sequence is obtained as
indicated above encoding NR6, oligonucleotides may be
designed which bind cDNA clones with high stringency.
Direct colony hybridisation may be employed or PCR
amplification may be used. The use of oligonucleotide
primers which bind under conditions of high stringency
ensures rapid cloning of a molecule encoding the novel
NR6 and less time is required in screening out cloning
artefacts. However, depending on the primers used, low
or medium stringency conditions may also be employed.

Alternatively, a library may be screened directly, such
as by using the oligonucleotides set forth in SEQ ID
NO:7 or SEQ ID NO:8, or a mixture of both
oligonucleotides. In addition, one or more of the
oligonucleotides defined in SEQ ID NOs:2 to 11 may also
be used.

Preferably, the nucleic acid library is a cDNA, genomic,
cDNA expression or mRNA library.

Preferably, the nucleic acid library is a cDNA
expression library.

Preferably, the nucleotide database is of human or
murine origin and of brain, liver, kidney, neonatal
tissue, embryonic tissue, tumour or cancer tissue
origin.

Preferred percentage similarities to the reference 
nucleotide sequences include at least about 70%, more 
preferably at least about 80%, still more preferably at 
least about 90% and even more preferably at least about 
95% or above. 
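
As a simple illustration of how a candidate sequence might be compared against these thresholds, the sketch below computes percentage identity over two pre-aligned nucleotide sequences. This is an assumption made for illustration: the specification refers to "similarity" without prescribing a scoring method here, and identity over an existing alignment is only one possible measure. The names `percent_identity` and `thresholds_met` are hypothetical.

```python
# Nucleotide thresholds named in the text: the base 60% level plus the
# preferred 70%, 80%, 90% and 95% levels.
THRESHOLDS = (60, 70, 80, 90, 95)


def percent_identity(aligned_a, aligned_b):
    """Percentage of identical positions between two pre-aligned sequences.

    Columns containing a gap ('-') in either sequence are ignored
    (assumption: one of several possible conventions).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != "-" and y != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


def thresholds_met(identity):
    """List the thresholds that a given percentage meets or exceeds."""
    return [t for t in THRESHOLDS if identity >= t]


if __name__ == "__main__":
    pid = percent_identity("ACGTACGTAC", "ACGTACGTTT")
    print(pid, thresholds_met(pid))
```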



Another aspect of the present invention provides an
isolated nucleic acid molecule comprising a sequence of
nucleotides encoding a novel haemopoietin receptor or
derivative thereof having an amino acid sequence as set
forth in SEQ ID NO:13 or having at least about 50%
similarity to all or part thereof.

Still yet another aspect of the present invention
provides an isolated nucleic acid molecule comprising a
sequence of nucleotides encoding a novel haemopoietin
receptor or derivative thereof having an amino acid
sequence as set forth in SEQ ID NO:15 or having at least
about 50% similarity to all or part thereof.

Even yet another aspect of the present invention
provides an isolated nucleic acid molecule comprising a
sequence of nucleotides encoding a novel haemopoietin
receptor or derivative thereof having an amino acid
sequence as set forth in SEQ ID NO:17 or having at least
about 50% similarity to all or part thereof.

A further aspect of the present invention provides an
isolated nucleic acid molecule comprising a sequence of
nucleotides encoding a novel haemopoietin receptor or
derivative thereof having an amino acid sequence as set
forth in SEQ ID NO:19 or having at least about 50%
similarity to all or part thereof.

Even yet another aspect of the present invention
provides an isolated nucleic acid molecule comprising a
sequence of nucleotides encoding a novel haemopoietin
receptor or derivative thereof having an amino acid
sequence as set forth in SEQ ID NO:25 or having at least
about 50% similarity to all or part thereof.

Another aspect of the present invention provides an
isolated nucleic acid molecule comprising a sequence of
nucleotides encoding a novel haemopoietin receptor or
derivative thereof having an amino acid sequence as set
forth in SEQ ID NO:29 or having at least about 50%
similarity to all or part thereof.

Preferably, the percentage amino acid similarity is at
least about 60%, more preferably at least about 70%,
even more preferably at least about 80-85% and still
even more preferably at least about 90-95% or greater.



The NR6 polypeptide contemplated by the present
invention includes, therefore, derivatives which are
components, parts, fragments, homologues or analogues of
the novel haemopoietin receptors which are preferably
encoded by all or part of a nucleotide sequence
substantially set forth in SEQ ID NO:12 or 14 or 16 or
18 or 25 or 20 or 24 or 28 or 38 or a molecule having at
least about 60% nucleotide similarity to all or part
thereof or a molecule capable of hybridising to the
nucleotide sequence set forth in SEQ ID NO:12 or 14 or
16 or 18 or 20 or 24 or 28 or 38 or a complementary form
thereof. The NR6 molecule may be glycosylated or
non-glycosylated. When in glycosylated form, the
glycosylation may be substantially the same as naturally
occurring haemopoietin receptor or may be a modified form
of glycosylation. Altered or differential glycosylation
states may or may not affect binding activity of the
novel receptor.



The NR6 haemopoietin receptor may be in soluble form or 
may be expressed on a cell surface or conjugated or 
fused to a solid support or another molecule. 



As stated above, the present invention further
contemplates a range of derivatives of NR6. Derivatives
include fragments, parts, portions, mutants, homologues
and analogues of the NR6 polypeptide and corresponding
genetic sequence. Derivatives also include single or
multiple amino acid substitutions, deletions and/or
additions to NR6 or single or multiple nucleotide
substitutions, deletions and/or additions to the genetic
sequence encoding NR6. "Additions" to amino acid
sequences or nucleotide sequences include fusions with
other peptides, polypeptides or proteins or fusions to
nucleotide sequences. Reference herein to "NR6"
includes reference to all derivatives thereof including
functional derivatives or NR6 immunologically
interactive derivatives.



Analogues of NR6 contemplated herein include, but are
not limited to, modifications to side chains,
incorporation of unnatural amino acids and/or their
derivatives during peptide, polypeptide or protein
synthesis and the use of crosslinkers and other methods
which impose conformational constraints on the
proteinaceous molecule or their analogues.

Examples of side chain modifications contemplated by the
present invention include modifications of amino groups
such as by reductive alkylation by reaction with an
aldehyde followed by reduction with NaBH4; amidination
with methylacetimidate; acylation with acetic anhydride;
carbamoylation of amino groups with cyanate;
trinitrobenzylation of amino groups with
2,4,6-trinitrobenzene sulphonic acid (TNBS); acylation of
amino groups with succinic anhydride and
tetrahydrophthalic anhydride; and pyridoxylation of
lysine with pyridoxal-5-phosphate followed by reduction
with NaBH4.

The guanidine group of arginine residues may be modified
by the formation of heterocyclic condensation products
with reagents such as 2,3-butanedione, phenylglyoxal and
glyoxal.






WO 98/11225 



PCT/GB97/02479 



The carboxyl group may be modified by carbodiimide
activation via O-acylisourea formation followed by
subsequent derivatisation, for example, to a
corresponding amide.

Sulphydryl groups may be modified by methods such as
carboxymethylation with iodoacetic acid or
iodoacetamide; performic acid oxidation to cysteic acid;
formation of mixed disulphides with other thiol
compounds; reaction with maleimide, maleic anhydride or
other substituted maleimide; formation of mercurial
derivatives using 4-chloromercuribenzoate,
4-chloromercuriphenylsulphonic acid, phenylmercury
chloride, 2-chloromercuri-4-nitrophenol and other
mercurials; carbamoylation with cyanate at alkaline pH.

Tryptophan residues may be modified by, for example,
oxidation with N-bromosuccinimide or alkylation of the
indole ring with 2-hydroxy-5-nitrobenzyl bromide or
sulphenyl halides. Tyrosine residues, on the other hand,
may be altered by nitration with tetranitromethane to
form a 3-nitrotyrosine derivative.

Modification of the imidazole ring of a histidine
residue may be accomplished by alkylation with
iodoacetic acid derivatives or N-carbethoxylation with
diethylpyrocarbonate.



Examples of incorporating unnatural amino acids and
derivatives during peptide synthesis include, but are
not limited to, use of norleucine, 4-amino butyric acid,
4-amino-3-hydroxy-5-phenylpentanoic acid,
6-aminohexanoic acid, t-butylglycine, norvaline,
phenylglycine, ornithine, sarcosine,
4-amino-3-hydroxy-6-methylheptanoic acid, 2-thienyl
alanine and/or D-isomers of amino acids. A list of
unnatural amino acids contemplated herein is shown in
Table 1.

These types of modifications may be important to
stabilise NR6 if administered to an individual or for
use as a diagnostic reagent.

Crosslinkers can be used, for example, to stabilise 3D
conformations, using homo-bifunctional crosslinkers such
as the bifunctional imido esters having (CH2)n spacer
groups with n=1 to n=6, glutaraldehyde,
N-hydroxysuccinimide esters and hetero-bifunctional
reagents which usually contain an amino-reactive moiety
such as N-hydroxysuccinimide and another group
specific-reactive moiety such as maleimido or dithio
moiety (SH) or carbodiimide (COOH). In addition,
peptides can be conformationally constrained by, for
example, incorporation of Cα and Nα-methyl amino acids,
introduction of double bonds between Cα and Cβ atoms of
amino acids and the formation of cyclic peptides or
analogues by introducing covalent bonds such as forming
an amide bond between the N and C termini, between two
side chains or between a side chain and the N or C
terminus.



TABLE 1

Non-conventional amino acid                       Code

α-aminobutyric acid                               Abu
α-amino-α-methylbutyrate                          Mgabu
aminocyclopropane-carboxylate                     Cpro
aminoisobutyric acid                              Aib
aminonorbornyl-carboxylate                        Norb
cyclohexylalanine                                 Chexa
cyclopentylalanine                                Cpen
D-alanine                                         Dal
D-arginine                                        Darg
D-aspartic acid                                   Dasp
D-cysteine                                        Dcys
D-glutamine                                       Dgln
D-glutamic acid                                   Dglu
D-histidine                                       Dhis
D-isoleucine                                      Dile
D-leucine                                         Dleu
D-lysine                                          Dlys
D-methionine                                      Dmet
D-ornithine                                       Dorn
D-phenylalanine                                   Dphe
D-proline                                         Dpro
D-serine                                          Dser
D-threonine                                       Dthr
D-tryptophan                                      Dtrp
D-tyrosine                                        Dtyr
D-valine                                          Dval
D-α-methylalanine                                 Dmala
D-α-methylarginine                                Dmarg
D-α-methylasparagine                              Dmasn
D-α-methylaspartate                               Dmasp
L-N-methylalanine                                 Nmala
L-N-methylarginine                                Nmarg
L-N-methylasparagine                              Nmasn
L-N-methylaspartic acid                           Nmasp
L-N-methylcysteine                                Nmcys
L-N-methylglutamine                               Nmgln
L-N-methylglutamic acid                           Nmglu
L-N-methylhistidine                               Nmhis
L-N-methylisoleucine                              Nmile
L-N-methylleucine                                 Nmleu
L-N-methyllysine                                  Nmlys
L-N-methylmethionine                              Nmmet
L-N-methylnorleucine                              Nmnle
L-N-methylnorvaline                               Nmnva
L-N-methylornithine                               Nmorn
L-N-methylphenylalanine                           Nmphe
L-N-methylproline                                 Nmpro
L-N-methylserine                                  Nmser
L-N-methylthreonine                               Nmthr
L-N-methyltryptophan                              Nmtrp
L-N-methyltyrosine                                Nmtyr
L-N-methylvaline                                  Nmval
L-N-methylethylglycine                            Nmetg
L-N-methyl-t-butylglycine                         Nmtbug
L-norleucine                                      Nle
L-norvaline                                       Nva
α-methyl-aminoisobutyrate                         Maib
α-methyl-γ-aminobutyrate                          Mgabu
α-methylcyclohexylalanine                         Mchexa
α-methylcyclopentylalanine                        Mcpen
α-methyl-α-naphthylalanine                        Manap
α-methylpenicillamine                             Mpen
D-α-methylcysteine                                Dmcys
D-α-methylglutamine                               Dmgln
D-α-methylhistidine                               Dmhis
D-α-methylisoleucine                              Dmile
D-α-methylleucine                                 Dmleu
D-α-methyllysine                                  Dmlys
D-α-methylmethionine                              Dmmet
D-α-methylornithine                               Dmorn
D-α-methylphenylalanine                           Dmphe
D-α-methylproline                                 Dmpro
D-α-methylserine                                  Dmser
D-α-methylthreonine                               Dmthr
D-α-methyltryptophan                              Dmtrp
D-α-methyltyrosine                                Dmty
D-α-methylvaline                                  Dmval
D-N-methylalanine                                 Dnmala
D-N-methylarginine                                Dnmarg
D-N-methylasparagine                              Dnmasn
D-N-methylaspartate                               Dnmasp
D-N-methylcysteine                                Dnmcys
D-N-methylglutamine                               Dnmgln
D-N-methylglutamate                               Dnmglu
D-N-methylhistidine                               Dnmhis
D-N-methylisoleucine                              Dnmile
D-N-methylleucine                                 Dnmleu
D-N-methyllysine                                  Dnmlys
D-N-methylmethionine                              Dnmmet
D-N-methylornithine                               Dnmorn
D-N-methylphenylalanine                           Dnmphe
D-N-methylproline                                 Dnmpro
D-N-methylserine                                  Dnmser
D-N-methylthreonine                               Dnmthr
D-N-methyltryptophan                              Dnmtrp
D-N-methyltyrosine                                Dnmtyr
D-N-methylvaline                                  Dnmval
γ-aminobutyric acid                               Gabu
L-t-butylglycine                                  Tbug
N-methylcyclohexylalanine                         Nmchexa
N-methylcyclopentylalanine                        Nmcpen
N-methylglycine                                   Nala
N-methylaminoisobutyrate                          Nmaib
N-(1-methylpropyl)glycine                         Nile
N-(2-methylpropyl)glycine                         Nleu
N-(4-aminobutyl)glycine                           Nglu
N-(2-aminoethyl)glycine                           Naeg
N-(3-aminopropyl)glycine                          Norn
N-amino-α-methylbutyrate                          Nmaabu
α-naphthylalanine                                 Anap
N-benzylglycine                                   Nphe
N-(2-carbamylethyl)glycine                        Ngln
N-(carbamylmethyl)glycine                         Nasn
N-(2-carboxyethyl)glycine                         Nglu
N-(carboxymethyl)glycine                          Nasp
N-cyclobutylglycine                               Ncbut
N-cycloheptylglycine                              Nchep
N-cyclohexylglycine                               Nchex
N-cyclodecylglycine                               Ncdec
N-cyclododecylglycine                             Ncdod
N-cyclooctylglycine                               Ncoct
N-cyclopropylglycine                              Ncpro
N-cycloundecylglycine                             Ncund
N-(2,2-diphenylethyl)glycine                      Nbhm
N-(3,3-diphenylpropyl)glycine                     Nbhe
N-(3-guanidinopropyl)glycine                      Narg
N-(1-hydroxyethyl)glycine                         Nthr
N-(hydroxyethyl)glycine                           Nser
N-(imidazolylethyl)glycine                        Nhis
N-(3-indolylethyl)glycine                         Nhtrp
N-methyl-γ-aminobutyrate                          Nmgabu
N-(1-methylethyl)glycine                          Nval
N-methyl-α-naphthylalanine                        Nmanap
N-methylpenicillamine                             Nmpen
N-(p-hydroxyphenyl)glycine                        Nhtyr
N-(thiomethyl)glycine                             Ncys
L-ethylglycine                                    Etg
L-homophenylalanine                               Hphe
L-α-methylarginine                                Marg
L-α-methylaspartate                               Masp
L-α-methylcysteine                                Mcys
L-α-methylglutamine                               Mgln
L-α-methylhistidine                               Mhis
L-α-methylisoleucine                              Mile
L-α-methylleucine                                 Mleu
L-α-methylmethionine                              Mmet
L-α-methylnorvaline                               Mnva
L-α-methylphenylalanine                           Mphe
L-α-methylserine                                  Mser
L-α-methyltryptophan                              Mtrp
L-α-methylvaline                                  Mval
penicillamine                                     Pen
L-α-methylalanine                                 Mala
L-α-methylasparagine                              Masn
L-α-methyl-t-butylglycine                         Mtbug
L-methylethylglycine                              Metg
L-α-methylglutamate                               Mglu
L-α-methylhomophenylalanine                       Mhphe
N-(2-methylthioethyl)glycine                      Nmet
L-α-methyllysine                                  Mlys
L-α-methylnorleucine                              Mnle
L-α-methylornithine                               Morn
L-α-methylproline                                 Mpro
L-α-methylthreonine                               Mthr
L-α-methyltyrosine                                Mtyr
L-N-methylhomophenylalanine                       Nmhphe
N-(N-(2,2-diphenylethyl)carbamylmethyl)glycine    Nnbhm
N-(N-(3,3-diphenylpropyl)carbamylmethyl)glycine   Nnbhe
1-carboxy-1-(2,2-diphenylethylamino)cyclopropane  Nmbc

The present invention further contemplates chemical
analogues of NR6 capable of acting as antagonists or
agonists of NR6 or which can act as functional analogues
of NR6. Chemical analogues may not necessarily be
derived from NR6 but may share certain conformational
similarities. Alternatively, chemical analogues may be
specifically designed to mimic certain physicochemical
properties of NR6. Chemical analogues may be chemically
synthesised or may be detected following, for example,
natural product screening.

The identification of NR6 permits the generation of a
range of therapeutic molecules capable of modulating
expression of NR6 or modulating the activity of NR6.
Modulators contemplated by the present invention
include agonists and antagonists of NR6 expression.
Antagonists of NR6 expression include antisense
molecules, ribozymes and co-suppression molecules.
Agonists include molecules which increase promoter
activity or interfere with negative regulatory
mechanisms. Agonists of NR6 include molecules which
overcome any negative regulatory mechanism. Antagonists
of NR6 include antibodies and inhibitor peptide
fragments.

Other derivatives contemplated by the present invention
include a range of glycosylation variants from a
completely unglycosylated molecule to a modified
glycosylated molecule. Altered glycosylation patterns
may result from expression of recombinant molecules in
different host cells.

Another embodiment of the present invention
contemplates a method for modulating expression of NR6
in a subject such as a human or mouse, said method
comprising contacting the genetic sequence encoding NR6
with an effective amount of a modulator of NR6
expression for a time and under conditions sufficient to
up-regulate or down-regulate or otherwise modulate
expression of NR6. Modulating NR6 expression provides a
means of modulating NR6-ligand interaction or NR6
stimulation of cell activities.

Another aspect of the present invention contemplates a
method of modulating activity of NR6 in a human, said
method comprising administering to said human a
modulating effective amount of a molecule for a time and
under conditions sufficient to increase or decrease NR6
activity. The molecule may be a proteinaceous molecule
or a chemical entity and may also be a derivative of NR6
or its ligand or a chemical analogue or truncation
mutant of NR6 or its ligand.

The present invention, therefore, contemplates a
pharmaceutical composition comprising NR6 or a
derivative thereof or a modulator of NR6 expression or
NR6 activity and one or more pharmaceutically acceptable
carriers and/or diluents. These components are referred
to as the "active ingredients".

The pharmaceutical forms suitable for injectable use
include sterile aqueous solutions (where water soluble)
and sterile powders for the extemporaneous preparation
of sterile injectable solutions. Such forms must be stable
under the conditions of manufacture and storage and must
be preserved against the contaminating action of
microorganisms such as bacteria and fungi. The carrier
can be a solvent or dilution medium comprising, for
example, water, ethanol, polyol (for example, glycerol,
propylene glycol and liquid polyethylene glycol, and the
like), suitable mixtures thereof, and vegetable oils.
The proper fluidity can be maintained, for example, by
the use of surfactants. The prevention of the action
of microorganisms can be brought about by various
antibacterial and antifungal agents, for example,
parabens, chlorobutanol, phenol, sorbic acid,
thimerosal and the like. In many cases, it will be
preferable to include isotonic agents, for example,
sugars or sodium chloride. Prolonged absorption of the
injectable compositions can be brought about by the use
in the compositions of agents delaying absorption, for
example, aluminum monostearate and gelatin.






Sterile injectable solutions are prepared by
incorporating the active compounds in the required
amount in the appropriate solvent with various of the
other ingredients enumerated above, as required,
followed by filter sterilization. In the case of
sterile powders for the preparation of sterile
injectable solutions, the preferred methods of
preparation are vacuum drying and the freeze-drying
technique, which yield a powder of the active ingredient
plus any additional desired ingredient from a previously
sterile-filtered solution thereof.






When the active ingredients are suitably protected they
may be orally administered, for example, with an inert
diluent or with an assimilable edible carrier, or they may
be enclosed in hard or soft shell gelatin capsules,
compressed into tablets, or incorporated directly with
the food of the diet. For
oral therapeutic administration, the active compound may
be incorporated with excipients and used in the form of
ingestible tablets, buccal tablets, troches, capsules,
elixirs, suspensions, syrups, wafers, and the like.
Such compositions and preparations should contain at
least 1% by weight of active compound. The percentage
of the compositions and preparations may, of course, be
varied and may conveniently be between about 5% and about
80% of the weight of the unit. The amount of active
compound in such therapeutically useful compositions is
such that a suitable dosage will be obtained. Preferred
compositions or preparations according to the present
invention are prepared so that an oral dosage unit form
contains between about 0.1 µg and 2000 mg of active
compound. Alternative dosage amounts include from about
1 µg to about 1000 mg and from about 10 µg to about 500
mg.
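As a worked illustration of the weight-percentage guidance above, the following Python sketch computes the proportion of active compound in a dosage unit and compares it with the stated minimum of 1% and the convenient range of about 5% to about 80%. The function names and the example masses are hypothetical and chosen purely for illustration; they are not taken from the specification.

```python
def percent_active(active_mg: float, unit_mg: float) -> float:
    """Return the active compound as a percentage of the dosage unit weight."""
    if unit_mg <= 0:
        raise ValueError("unit weight must be positive")
    if not 0 <= active_mg <= unit_mg:
        raise ValueError("active mass must lie between 0 and the unit weight")
    return 100.0 * active_mg / unit_mg


def describe(percent: float) -> str:
    """Classify a percentage against the figures quoted in the text."""
    if percent < 1.0:
        return "below the stated 1% minimum"
    if 5.0 <= percent <= 80.0:
        return "within the convenient range of about 5% to about 80%"
    return "at least 1% but outside the convenient range"


if __name__ == "__main__":
    # Hypothetical example: 50 mg of active compound in a 500 mg tablet.
    p = percent_active(50.0, 500.0)
    print(f"{p:.1f}% by weight: {describe(p)}")
```

The thresholds are treated here as hard cut-offs only for simplicity; the text itself qualifies the range with "about".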



The tablets, troches, pills, capsules and the like may
also contain the components as listed hereafter: a
binder such as gum, acacia, corn starch or gelatin;
excipients such as dicalcium phosphate; a
disintegrating agent such as corn starch, potato starch,
alginic acid and the like; a lubricant such as
magnesium stearate; and a sweetening agent such as
sucrose, lactose or saccharin may be added or a
flavouring agent such as peppermint, oil of wintergreen,
or cherry flavouring. When the dosage unit form is a
capsule, it may contain, in addition to materials of the
above type, a liquid carrier. Various other materials
may be present as coatings or to otherwise modify the
physical form of the dosage unit. For instance,
tablets, pills, or capsules may be coated with shellac,
sugar or both. A syrup or elixir may contain the active
compound, sucrose as a sweetening agent, methyl and
propylparabens as preservatives, a dye and flavouring
such as cherry or orange flavour. Of course, any
material used in preparing any dosage unit form should
be pharmaceutically pure and substantially non-toxic in
the amounts employed. In addition, the active
compound(s) may be incorporated into sustained-release
preparations and formulations.

The present invention also extends to forms suitable for 
topical application such as creams, lotions and gels as 
well as a range of "paints" which are applied to skin 
and through which the active ingredients are absorbed. 

Pharmaceutically acceptable carriers and/or diluents
include any and all solvents, dispersion media,
coatings, antibacterial and antifungal agents, isotonic
and absorption delaying agents and the like. The use of
such media and agents for pharmaceutically active
substances is well known in the art and, except insofar
as any conventional media or agent is incompatible with
the active ingredient, their use in the therapeutic
compositions is contemplated. Supplementary active
ingredients can also be incorporated into the
compositions.

It is especially advantageous to formulate parenteral
compositions in dosage unit form for ease of
administration and uniformity of dosage. Dosage unit
form as used herein refers to physically discrete units
suited as unitary dosages for the mammalian subjects to
be treated; each unit containing a predetermined
quantity of active material calculated to produce the
desired therapeutic effect in association with the
required pharmaceutical carrier. The specifications for
the novel dosage unit forms of the invention are
dictated by and directly dependent on (a) the unique
characteristics of the active material and the
particular therapeutic effect to be achieved, and (b)
the limitations inherent in the art of compounding such
an active material for the treatment of disease in
living subjects having a diseased condition in which
bodily health is impaired as herein disclosed in detail.

The principal active ingredient is compounded for
convenient and effective administration in effective
amounts with a suitable pharmaceutically acceptable
carrier in dosage unit form as hereinbefore disclosed.
A unit dosage form can, for example, contain the
principal active compound in amounts ranging from 0.5 µg
to about 2000 mg. Expressed in proportions, the active
compound is generally present in from about 0.5 µg to
about 2000 mg/ml of carrier. In the case of
compositions containing supplementary active
ingredients, the dosages are determined by reference to
the usual dose and manner of administration of the said
ingredients.

Dosages may also be expressed per body weight of the
recipient. For example, from about 10 ng to about 1000
mg/kg body weight, from about 100 ng to about 500 mg/kg
body weight and from about 1 µg to about 250 mg/kg body
weight may be administered.
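The per-body-weight ranges can be turned into total amounts for a given recipient by simple multiplication. The Python sketch below shows that arithmetic; the range labels ("broad", "intermediate", "narrow"), the function name and the 70 kg example are hypothetical and used only for illustration, and the third range is read as about 1 µg to about 250 mg/kg.

```python
NG_PER_MG = 1_000_000  # 1 mg = 1,000,000 ng

# Illustrative ranges from the text, as (low, high) in ng per kg body weight.
# The labels are hypothetical and do not appear in the specification.
RANGES_NG_PER_KG = {
    "broad": (10, 1000 * NG_PER_MG),        # about 10 ng to about 1000 mg/kg
    "intermediate": (100, 500 * NG_PER_MG),  # about 100 ng to about 500 mg/kg
    "narrow": (1_000, 250 * NG_PER_MG),      # about 1 µg (1,000 ng) to about 250 mg/kg
}


def total_dose_range_mg(weight_kg: float, range_name: str) -> tuple:
    """Return the (low, high) total dose in mg for a recipient of weight_kg."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low_ng, high_ng = RANGES_NG_PER_KG[range_name]
    return (low_ng * weight_kg / NG_PER_MG, high_ng * weight_kg / NG_PER_MG)


if __name__ == "__main__":
    for name in RANGES_NG_PER_KG:
        low, high = total_dose_range_mg(70, name)
        print(f"{name}: {low:g} mg to {high:g} mg for a 70 kg recipient")
```

Working in nanograms internally keeps the lower bounds as exact integers until the final conversion to milligrams.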

The pharmaceutical composition may also comprise genetic
molecules such as a vector capable of transfecting
target cells where the vector carries a nucleic acid
molecule capable of modulating NR6 expression or NR6
activity. The vector may, for example, be a viral
vector.

Still another aspect of the present invention is
directed to antibodies to NR6 and its derivatives. Such
antibodies may be monoclonal or polyclonal and may be
selected from naturally occurring antibodies to NR6 or
may be specifically raised to NR6 or derivatives
thereof. In the case of the latter, NR6 or its
derivatives may first need to be associated with a
carrier molecule. The antibodies and/or recombinant NR6
or its derivatives of the present invention are
particularly useful as therapeutic or diagnostic agents.
For example, NR6 antibodies or antibodies to its ligand
may act as antagonists.

For example, NR6 and its derivatives can be used to
screen for naturally occurring antibodies to NR6. These
may occur, for example, in some autoimmune diseases.
Alternatively, specific antibodies can be used to screen
for NR6. Techniques for such assays are well known in
the art and include, for example, sandwich assays and
ELISA. Knowledge of NR6 levels may be important for
diagnosis of certain cancers or a predisposition to
cancers or for monitoring certain therapeutic protocols.

Antibodies to NR6 of the present invention may be
monoclonal or polyclonal. Alternatively, fragments of
antibodies may be used such as Fab fragments.
Furthermore, the present invention extends to
recombinant and synthetic antibodies and to antibody
hybrids. A "synthetic antibody" is considered herein to
include fragments and hybrids of antibodies. The
antibodies of this aspect of the present invention are
particularly useful for immunotherapy and may also be
used as a diagnostic tool for assessing apoptosis or
monitoring the progress of a therapeutic regimen.

For example, specific antibodies can be used to screen
for NR6 proteins. The latter would be important, for
example, as a means for screening for levels of NR6 in a
cell extract or other biological fluid or purifying NR6
made by recombinant means from culture supernatant
fluid. Techniques for the assays contemplated herein
are known in the art and include, for example, sandwich
assays and ELISA.






It is within the scope of this invention to include any
second antibodies (monoclonal, polyclonal or fragments
of antibodies or synthetic antibodies) directed to the
first mentioned antibodies discussed above. Both the
first and second antibodies may be used in detection
assays or a first antibody may be used with a
commercially available anti-immunoglobulin antibody. An
antibody as contemplated herein includes any antibody
specific to any region of NR6.






Both polyclonal and monoclonal antibodies are obtainable 
by immunization with the enzyme or protein and either 
type is utilizable for immunoassays. The methods of 
obtaining both types of sera are well known in the art. 
Polyclonal sera are less preferred but are relatively 
easily prepared by injection of a suitable laboratory 
animal with an effective amount of NR6, or antigenic 
parts thereof, collecting serum from the animal, and 
isolating specific sera by any of the known 
immunoadsorbent techniques. Although antibodies 
produced by this method are utilizable in virtually any 
type of immunoassay, they are generally less favoured 
because of the potential heterogeneity of the product. 

The use of monoclonal antibodies in an immunoassay is
particularly preferred because of the ability to produce
them in large quantities and the homogeneity of the
product. The preparation of hybridoma cell lines for
monoclonal antibody production derived by fusing an
immortal cell line and lymphocytes sensitized against
the immunogenic preparation can be done by techniques
which are well known to those who are skilled in the
art.

Another aspect of the present invention contemplates a
method for detecting NR6 in a biological sample from a
subject, said method comprising contacting said
biological sample with an antibody specific for NR6 or
its derivatives or homologues for a time and under
conditions sufficient for an antibody-NR6 complex to
form, and then detecting said complex.

The presence of NR6 may be determined in a number of
ways such as by Western blotting and ELISA procedures.
A wide range of immunoassay techniques are available as
can be seen by reference to US Patent Nos. 4,016,043,
4,424,279 and 4,018,653. These, of course, include both
single-site and two-site or "sandwich" assays of the
non-competitive types, as well as the traditional
competitive binding assays. These assays also include
direct binding of a labelled antibody to a target.

Sandwich assays are among the most useful and commonly
used assays and are favoured for use in the present
invention. A number of variations of the sandwich assay
technique exist, and all are intended to be encompassed
by the present invention. Briefly, in a typical forward
assay, an unlabelled antibody is immobilized on a solid
substrate and the sample to be tested brought into
contact with the bound molecule. After a suitable
period of incubation, for a period of time sufficient to
allow formation of an antibody-antigen complex, a second
antibody specific to the antigen, labelled with a
reporter molecule capable of producing a detectable
signal is then added and incubated, allowing time
sufficient for the formation of another complex of
antibody-antigen-labelled antibody. Any unreacted
material is washed away, and the presence of the antigen 
is determined by observation of a signal produced by the 
reporter molecule. The results may either be 
qualitative, by simple observation of the visible 
signal, or may be quantitated by comparing with a 
control sample containing known amounts of hapten. 
Variations on the forward assay include a simultaneous 
assay, in which both sample and labelled antibody are 
added simultaneously to the bound antibody. These 
techniques are well known to those skilled in the art, 
including any minor variations as will be readily 
apparent. In accordance with the present invention, the 
sample is one which might contain NR6 including cell 
extract, tissue biopsy or possibly serum, saliva, 
mucosal secretions, lymph, tissue fluid and respiratory 
fluid. The sample is, therefore, generally a biological 
sample comprising biological fluid but also extends to 
fermentation fluid and supernatant fluid such as from a 
cell culture. 
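Quantitation against control samples containing known amounts of analyte is commonly done with a standard curve. The Python sketch below illustrates one simple approach, linear interpolation between neighbouring standards; it assumes the signal rises monotonically with the amount of analyte, and the function name and the example readings are hypothetical rather than taken from the specification.

```python
from typing import List, Tuple


def interpolate_amount(signal: float, standards: List[Tuple[float, float]]) -> float:
    """Estimate the analyte amount for a measured signal.

    standards holds (known_amount, measured_signal) pairs obtained from
    control samples. The estimate is a linear interpolation between the
    two standards whose signals bracket the measured signal.
    """
    if len(standards) < 2:
        raise ValueError("at least two standards are needed")
    points = sorted(standards, key=lambda p: p[1])  # order by signal
    for (a0, s0), (a1, s1) in zip(points, points[1:]):
        if s0 <= signal <= s1:
            if s1 == s0:
                return (a0 + a1) / 2
            return a0 + (signal - s0) * (a1 - a0) / (s1 - s0)
    raise ValueError("signal lies outside the range of the standards")


if __name__ == "__main__":
    # Hypothetical standards: (amount in ng/ml, absorbance reading).
    curve = [(0.0, 0.05), (10.0, 0.25), (20.0, 0.45)]
    print(round(interpolate_amount(0.35, curve), 3))
```

A signal outside the range of the standards raises an error rather than being extrapolated, since the curve is only characterised between the lowest and highest control.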

In the typical forward sandwich assay, a first antibody
having specificity for the NR6 or antigenic parts
thereof, is either covalently or passively bound to a
solid surface. The solid surface is typically glass or
a polymer, the most commonly used polymers being
cellulose, polyacrylamide, nylon, polystyrene, polyvinyl
chloride or polypropylene. The solid supports may be in
the form of tubes, beads, discs or microplates, or any
other surface suitable for conducting an immunoassay.
The binding processes are well known in the art and
generally consist of cross-linking, covalently binding or
physically adsorbing; the polymer-antibody complex is
then washed in preparation for the test sample. An aliquot
of the sample to be tested is then added to the solid
phase complex and incubated for a period of time
sufficient (e.g. 2-40 minutes or overnight if more
convenient) and under suitable conditions (e.g. from
about room temperature to about 37°C) to allow binding
of any subunit present in the antibody. Following the
incubation period, the antibody subunit solid phase is
washed and dried and incubated with a second antibody
specific for a portion of the hapten. The second
antibody is linked to a reporter molecule which is used
to indicate the binding of the second antibody to the
hapten.

An alternative method involves immobilizing the target
molecules in the biological sample and then exposing the
immobilized target to specific antibody which may or may
not be labelled with a reporter molecule. Depending on
the amount of target and the strength of the reporter
molecule signal, a bound target may be detectable by
direct labelling with the antibody. Alternatively, a
second labelled antibody, specific to the first antibody,
is exposed to the target-first antibody complex to form
a target-first antibody-second antibody tertiary
complex. The complex is detected by the signal emitted
by the reporter molecule.

In another alternative method, the NR6 ligand is
immobilised to a solid support and a biological sample
containing NR6 brought into contact with its immobilised
ligand. Binding between NR6 and its ligand can then be
determined using an antibody to NR6 which itself may be
labelled with a reporter molecule, or a further
anti-immunoglobulin antibody labelled with a reporter
molecule could be used to detect antibody bound to NR6.

By "reporter molecule" as used in the present
specification, is meant a molecule which, by its
chemical nature, provides an analytically identifiable
signal which allows the detection of antigen-bound
antibody. Detection may be either qualitative or
quantitative. The most commonly used reporter molecules
in this type of assay are either enzymes, fluorophores
or radionuclide containing molecules (i.e.
radioisotopes) and chemiluminescent molecules.

In the case of an enzyme immunoassay, an enzyme is
conjugated to the second antibody, generally by means of
glutaraldehyde or periodate. As will be readily
recognized, however, a wide variety of different
conjugation techniques exist, which are readily
available to the skilled artisan. Commonly used enzymes
include horseradish peroxidase, glucose oxidase,
beta-galactosidase and alkaline phosphatase, amongst others.
The substrates to be used with the specific enzymes are
generally chosen for the production, upon hydrolysis by
the corresponding enzyme, of a detectable colour change.
Examples of suitable enzymes include alkaline
phosphatase and peroxidase. It is also possible to
employ fluorogenic substrates, which yield a fluorescent
product rather than the chromogenic substrates noted
above. In all cases, the enzyme-labelled antibody is
added to the first antibody-hapten complex, allowed to
bind, and then the excess reagent is washed away. A
solution containing the appropriate substrate is then
added to the complex of antibody-antigen-antibody. The
substrate will react with the enzyme linked to the
second antibody, giving a qualitative visual signal,
which may be further quantitated, usually
spectrophotometrically, to give an indication of the
amount of hapten which was present in the sample.

"Reporter molecule" also extends to use of cell
agglutination or inhibition of agglutination such as red
blood cells or latex beads, and the like.

Alternatively, fluorescent compounds, such as fluorescein
and rhodamine, may be chemically coupled to antibodies
without altering their binding capacity. When activated
by illumination with light of a particular wavelength,
the fluorochrome-labelled antibody absorbs the light
energy, inducing a state of excitability in the
molecule, followed by emission of the light at a
characteristic colour visually detectable with a light
microscope. As in the EIA, the fluorescent labelled
antibody is allowed to bind to the first antibody-hapten
complex. After washing off the unbound reagent, the
remaining tertiary complex is then exposed to light
of the appropriate wavelength; the fluorescence observed
indicates the presence of the hapten of interest.
Immunofluorescence and EIA techniques are both very well
established in the art and are particularly preferred
for the present method. However, other reporter
molecules, such as radioisotope, chemiluminescent or
bioluminescent molecules, may also be employed.

The present invention also contemplates genetic assays
such as involving PCR analysis to detect the NR6 gene or
its derivatives. Alternative methods or methods used in
conjunction include direct nucleotide sequencing or
mutation scanning such as single-stranded conformational
polymorphism (SSCP) analysis, specific
oligonucleotide hybridisation, and methods such as direct
protein truncation tests.

The nucleic acid molecules of the present invention may
be DNA or RNA. When the nucleic acid molecule is in a
DNA form, it may be genomic DNA or cDNA. RNA forms of
the nucleic acid molecules of the present invention are
generally mRNA.

Although the nucleic acid molecules of the present
invention are generally in isolated form, they may be
integrated into or ligated to or otherwise fused or
associated with other genetic molecules such as vector
molecules and in particular expression vector molecules.
Vectors and expression vectors are generally capable of
replication and, if applicable, expression in one or
both of a prokaryotic cell or a eukaryotic cell.
Preferably, prokaryotic cells include E. coli, Bacillus
sp. and Pseudomonas sp. Preferred eukaryotic cells
include yeast, fungal, mammalian and insect cells.

Accordingly, another aspect of the present invention
contemplates a genetic construct comprising a vector
portion and a mammalian and more particularly a human
NR6 gene portion, which NR6 gene portion is capable of
encoding an NR6 polypeptide or a functional or
immunologically interactive derivative thereof.

Preferably, the NR6 gene portion of the genetic
construct is operably linked to a promoter on the vector
such that said promoter is capable of directing
expression of said NR6 gene portion in an appropriate
cell.

In addition, the NR6 gene portion of the genetic
construct may comprise all or part of the gene fused to
another genetic sequence such as a nucleotide sequence
encoding maltose binding protein or
glutathione-S-transferase or part thereof.

The present invention extends to such genetic constructs 
and to prokaryotic or eukaryotic cells comprising same. 

The present invention also extends to any or all
derivatives of NR6 including mutants, parts, fragments,
portions, homologues and analogues or their encoding
genetic sequence including single or multiple nucleotide
or amino acid substitutions, additions and/or deletions
to the naturally occurring nucleotide or amino acid
sequence.

NR6 may be important for the proliferation,
differentiation and survival of a diverse array of cell
types. Accordingly, it is proposed that NR6 or its
functional derivatives be used to regulate development,
maintenance or regeneration in an array of different
cells and tissues in vitro and in vivo. For example,
NR6 is contemplated to be useful in modulating neuronal
proliferation, differentiation and survival.

Soluble NR6 polypeptides are also contemplated to be
useful in the treatment of a range of diseases, injuries
or abnormalities.

Membrane bound or soluble NR6 may be used in vitro on
nerve cells or tissues to modulate proliferation,
differentiation or survival, for example, in grafting
procedures or transplantation.

As stated above, the NR6 of the present invention or its
functional derivatives may be provided in a
pharmaceutical composition comprising the NR6 together
with one or more pharmaceutically acceptable carriers
and/or diluents. In addition, the present invention
contemplates a method of treatment comprising the
administration of an effective amount of an NR6 of the
present invention. The present invention also extends
to antagonists and agonists of NR6 and their use in
therapeutic compositions and methodologies.

A further aspect of the present invention contemplates
the use of NR6 or its functional derivatives in the
manufacture of a medicament for the treatment of
conditions in which NR6-mediated function is defective
or deficient.

Still a further aspect of the present invention
contemplates a ligand for NR6, preferably in isolated or
recombinant form, or a derivative of said ligand.

The present invention further contemplates knockout
animals such as mice or other murine species for the NR6
gene including homozygous and heterozygous knockout
animals. Such animals provide a particularly useful
live in vivo model for studying the effects of NR6 as
well as screening for agents capable of acting as
agonists or antagonists of NR6.



According to this embodiment there is provided a
transgenic animal comprising a mutation in at least one
allele of the gene encoding NR6. Additionally, the
present invention provides a transgenic animal
comprising a mutation in two alleles of the gene
encoding NR6. Preferably, the transgenic animal is a
murine animal such as a mouse or rat.

The present invention is further described by the 
following non-limiting Figures and Examples. 

In the Figures:

Figure 1 is a diagrammatic representation showing
expansion of sequenced region of the mouse NR6 gene
indicating splicing patterns seen in the three forms of
NR6 cDNA, NR6.1, NR6.2 and NR6.3.



Figure 2 is a representation of the nucleotide sequence
of the mouse NR6 gene, containing exons encoding the
cDNA from nucleotide 148 encoding D50 of the cDNAs shown
in SEQ ID NOs:12 and 14 to the end of the 3'
untranslated region shared by NR6.1, NR6.2 and
NR6.3. In this figure, this region encompasses
nucleotides g1182 to g6617. This sequence is also
defined in SEQ ID NO: 28.

Figure 3 is a representation of the nucleotide sequence
of the mouse genomic NR6 gene with additional 5'
sequences. The coding exons of NR6 span approximately
11 kb of the mouse genome. There are 9 coding exons
separated by 8 introns:

exon 1   at least 239nt   intron 1   5195nt
exon 2   282nt            intron 2   214nt
exon 3   130nt            intron 3   107nt
exon 4   170nt            intron 4   1372nt
exon 5   158nt            intron 5   68nt
exon 6   169nt            intron 6   2020nt
exon 7   188nt            intron 7   104nt
exon 8   43nt             intron 8   181nt
exon 9   252nt

Exon 1 encodes the signal sequence, exon 2 the Ig-like
domain, exons 3 to 6 the hemopoietin domain. Exons 7, 8
and 9 are alternatively spliced.

Figure 4 is a diagrammatic representation showing the
genomic structure of murine NR6.

Figure 5 is a diagrammatic representation showing
targeting of the NR6 locus by homologous recombination.






WO 98/11225 



PCT/GB97/02479 



Single and three letter abbreviations for amino acid
residues used in the specification are summarised in
Table 2:








TABLE 2

Amino Acid        Three-letter      One-letter
                  Abbreviation      Symbol

Alanine           Ala               A
Arginine          Arg               R
Asparagine        Asn               N
Aspartic acid     Asp               D
Cysteine          Cys               C
Glutamine         Gln               Q
Glutamic acid     Glu               E
Glycine           Gly               G
Histidine         His               H
Isoleucine        Ile               I
Leucine           Leu               L
Lysine            Lys               K
Methionine        Met               M
Phenylalanine     Phe               F
Proline           Pro               P
Serine            Ser               S
Threonine         Thr               T
Tryptophan        Trp               W
Tyrosine          Tyr               Y
Valine            Val               V
Any residue       Xaa               X
Sequence                                              SEQ ID NO.

Amino acid sequence WSXWS                             1
Oligonucleotide primers and probes listed
  in Example 1                                        2-11
Nucleotide sequence of NR6.1¹                         12
Amino acid sequence of NR6.1                          13
Nucleotide sequence of NR6.2²                         14
Amino acid sequence of NR6.2                          15
Nucleotide sequence of NR6.3³                         16
Amino acid sequence of NR6.3                          17
Nucleotide sequence of products generated
  by 5' RACE of brain cDNA using NR6
  specific primers⁴                                   18
Amino acid sequence of SEQ ID NO: 18                  19
Nucleotide sequence unique to 5' RACE of
  brain cDNA                                          20
Amino acid sequence for SEQ ID NO: 20                 21
Unspliced murine NR6 nucleotide sequence              22
PCR product for human NR6                             23
Nucleotide sequence of clone HFK-66
  encoding human NR6                                  24
Amino acid sequence of SEQ ID NO: 24                  25
Oligonucleotide sequences UP1 and LP1,
  respectively                                        26-27
Genomic nucleotide sequence of murine NR6             28
Amino acid sequence of SEQ ID NO: 28                  29
Murine NR6.1 oligonucleotide primers                  30, 31
Murine IL-3 signal sequence                           32
Linker sequence for mouse IL-3 signal
  sequence and FLAG epitope                           33-35
Genomic nucleotide sequence of murine NR6
  containing additional 5' sequence                   38
Oligonucleotide 2199 and 2200, respectively           36, 37
N-terminal region of NR6                              39

1 The polyadenylation signal AATAAATAAA is at nucleotide
positions 1451 to 1460; NR6.1 (SEQ ID NO: 12) and NR6.2
(SEQ ID NO: 14) are identical to nucleotide 1223 encoding
Q407, which represents the end of an exon. NR6.1 splices
out an exon present only in NR6.2 and uses a different
reading frame for the final exon which is shared with
NR6.2; this corresponds to amino acids VLPAKL at amino
acid residue positions 408-413. The region of
3'-untranslated DNA shared by NR6.1, NR6.2 and NR6.3 is
from nucleotide 1240 to 1475. The WSXWS motif is at
amino acid residues 330 to 334.

2 The polyadenylation signal AATAAA is at nucleotide
positions 1494 to 1503. The WSXWS motif is at amino
acid residues 330 to 334. NR6.1 and NR6.2 are identical
to nucleotide 1223 encoding Q407 which represents the
end of an exon. NR6.2 splices in an exon which begins at
amino acid residue D408, nucleotide 1224 and ends at
residue G422, nucleotide 1264. The region of 3'
untranslated DNA shared by NR6.1, NR6.2 and NR6.3 is
from nucleotide position 1283 to 1517.

3 The nucleotide and amino acid numbering corresponds to
SEQ ID NO:12 and 14. The WSXWS motif is at amino acid
residues 330 to 334. The polyadenylation signal
AATAAATAAA is from nucleotide 1781 to 1790. NR6.1,
NR6.2 and NR6.3 are identical to nucleotide 1223
encoding Q407, which represents the end of an exon.
NR6.3 fails to splice from this position and, therefore,
translation continues through the intron, giving rise to
the C-terminal protein region from amino acid residues
408 to 461. The region of 3' untranslated DNA shared by
NR6.1, NR6.2 and NR6.3 is from nucleotide 1469 to 1804.

4 The nucleotide sequence is identical to NR6.1, NR6.2
and NR6.3 from nucleotide C151, the first nucleotide for
Pro51. The numbering from this nucleotide is the same
as for SEQ ID NO: 14 and 16. The sequence 5' of this point is
unique to the products generated by 5' RACE, not being
found in NR6.1, NR6.2 and NR6.3, and is represented in
SEQ ID NOs:20 and 21.
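The notes above locate the WSXWS motif by residue number. As an illustration of how such a motif could be located in an amino acid sequence, the following Python sketch scans a one-letter protein string for WSXWS, where X is any residue, and reports 1-based positions. The function name and the short test sequence are hypothetical; the sequence is not any of the NR6 sequences of the specification.

```python
import re

# "X" in WSXWS stands for any residue, so it is matched with "." here.
# The lookahead allows overlapping occurrences to be reported as well.
_WSXWS = re.compile(r"(?=(WS.WS))")


def find_wsxws(protein: str) -> list:
    """Return (start, end, motif) tuples with 1-based, inclusive positions."""
    seq = protein.upper()
    return [(m.start() + 1, m.start() + 5, m.group(1))
            for m in _WSXWS.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical toy sequence, used only to exercise the function.
    print(find_wsxws("MKAWSDWSPLQ"))
```

Positions are reported in the same 1-based convention used for residue numbering in the notes above.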

Structure of the murine genomic NR6 locus. The coding
exons of NR6 span approximately 11 kb of the mouse
genome. There are 9 coding exons separated by 8
introns:

exon 1   at least 239nt   intron 1   5195nt
exon 2   282nt            intron 2   214nt
exon 3   130nt            intron 3   107nt
exon 4   170nt            intron 4   1372nt
exon 5   158nt            intron 5   68nt
exon 6   169nt            intron 6   2020nt
exon 7   188nt            intron 7   104nt
exon 8   43nt             intron 8   181nt
exon 9   252nt

Exon 1 encodes the signal sequence, exon 2 the Ig-like
domain, exons 3 to 6 the hemopoietin domain. Exons 7, 8
and 9 are alternatively spliced.

The NR6 molecules of the present invention have a range of
utilities referred to in the subject specification.
Additional utilities include:

1. Identification of molecules that interact with NR6.
These may include:

a) a corresponding ligand using standard orphan receptor
techniques (26),

b) monoclonal antibodies that act either as receptor
antagonists or agonists,

c) mimetic or antagonistic peptides isolated using phage
display technology (27,28),

d) small molecule natural products that act either as
antagonists or agonists.

2. Development of diagnostics to detect
deletions/rearrangements in the NR6 gene.

The NR6 knock-out mice studies described herein provide a
useful model for this utility. There are also applications
in the field of reproduction. For example, people can be
tested for their NR6 status. NR6 +/- carriers might be
expected to give rise to offspring with developmental
problems.

EXAMPLE 1
Oligonucleotides

M116    5' ACTCGCTCCAGATTCCCGCCTTTT 3'      [SEQ ID NO: 2]
M108    5' TCCCGCCTTTTTCGACCCATAGAT 3'      [SEQ ID NO: 3]
M159    5' GGTACTTGGCTTGGAAGAGGAAAT 3'      [SEQ ID NO: 4]
M242    5' CGGCTCACGTGCACGTCGGGTGGG 3'      [SEQ ID NO: 5]
M112    5' AGCTGCTGTTAAAGGGCTTCTC 3'        [SEQ ID NO: 6]
WSDWS   5' (A/G)CTCCA(A/G)TC(A/G)CTCCA 3'   [SEQ ID NO: 7]
WSEWS   5' (A/G)CTCCA(C/T)TC(A/G)CTCCA 3'   [SEQ ID NO: 8]
1944    5' AAGTGTGACCATCATGTGGAC 3'         [SEQ ID NO: 9]
2106    5' GGAGGTGTTAAGGAGGCG 3'            [SEQ ID NO: 10]
2120    5' ATGCCCGCGGGTCGCCCG 3'            [SEQ ID NO: 11]
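The relationship between the two degenerate oligonucleotides and the motif they target can be checked with a short script. The following is a minimal sketch only: it assumes the probes are antisense to the coding strand (so the reverse complement is what gets translated) and uses the standard genetic code. The function names are illustrative and are not part of the specification.

```python
import re
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Degenerate probes written exactly as in the oligonucleotide table.
PROBES = {
    "WSDWS": "(A/G)CTCCA(A/G)TC(A/G)CTCCA",
    "WSEWS": "(A/G)CTCCA(C/T)TC(A/G)CTCCA",
}


def expand(oligo):
    """Return every concrete sequence encoded by a degenerate oligo."""
    parts = re.findall(r"\(([ACGT/]+)\)|([ACGT])", oligo)
    choices = [p[0].split("/") if p[0] else [p[1]] for p in parts]
    return ["".join(combo) for combo in product(*choices)]


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def translate(seq):
    """Translate in frame 1, ignoring any trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def encoded_peptides(oligo):
    """Peptides encoded by the strand complementary to each probe variant."""
    return {translate(reverse_complement(s)) for s in expand(oligo)}


if __name__ == "__main__":
    for name, oligo in PROBES.items():
        print(name, len(expand(oligo)), "variants ->", encoded_peptides(oligo))
```

Under these assumptions each probe expands to eight sequences, and every variant of a given probe is complementary to a coding sequence for the same five-residue motif.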



EXAMPLE 2 

Isolation of initial NR6 cDNA clones using 
oligonucleotides designed against the conserved WSXWS 
motif found in members of the haemopoietin receptor 
family 

(i) A commercial adult mouse testis cDNA library cloned
into the UNI-ZAP bacteriophage (Stratagene, CA, USA;
Catalogue number 937308) was used to infect
Escherichia coli of the strain LE392. Infected bacteria
were grown on twenty 150 mm agar plates, to give
approximately 50,000 plaques per plate. Plaques were
then transferred to duplicate 150 mm diameter nylon
membranes (Colony/Plaque Screen, NEN Research Products,
MA, USA), bacteria were lysed and the DNA was denatured
and fixed by autoclaving at 100°C for 1 min with dry
exhaust. The filters were rinsed twice in 0.1% (w/v)
sodium dodecyl sulfate (SDS), 0.1 x SSC (SSC is 150 mM
sodium chloride, 15 mM sodium citrate dihydrate) at room
temperature and pre-hybridized overnight at 42°C in 6 x
SSC containing 2 mg/ml bovine serum albumin, 2 mg/ml
Ficoll, 2 mg/ml polyvinylpyrrolidone, 100 mM ATP, 10
mg/ml tRNA, 2 mM sodium pyrophosphate, 2 mg/ml salmon
sperm DNA, 0.1% (w/v) SDS and 200 mg/ml sodium azide.
The pre-hybridisation buffer was removed. 1.2 µg of the
degenerate oligonucleotides for hybridization (WSDWS;
Example 1) were phosphorylated with T4 polynucleotide
kinase using 960 mCi of γ-32P-ATP (Bresatec, S.A.,
Australia). Unincorporated ATP was separated from the
labelled oligonucleotide using a pre-packed gel
filtration column (NAP-5; Pharmacia, Uppsala, Sweden).
Filters were hybridized overnight at 42°C in 80 ml of
the prehybridisation buffer containing 0.1% (w/v) SDS,
rather than NP40, and 10^6 - 10^7 cpm/ml of labelled
oligonucleotide. Filters were briefly rinsed twice at
room temperature in 6 x SSC, 0.1% (v/v) SDS, twice for 30
min at 45°C in a shaking waterbath containing 1.5 l of
the same buffer and then briefly in 6 x SSC at room
temperature. Filters were then blotted dry and exposed
to autoradiographic film at -70°C using intensifying
screens, for 7-14 days prior to development.

Plaques that appeared positive on orientated duplicate
filters were picked, eluted in 1 ml of 100 mM NaCl, 10
mM MgCl2, 10 mM Tris.HCl pH7.4 containing 0.5% (w/v)
gelatin and 0.5% (v/v) chloroform and stored at 4°C.
After 2 days LE392 cells were infected with the eluate
from the primary plugs and replated for the secondary
screen. This process was repeated until hybridizing
plaques were pure.

Once purified, positive cDNAs were excised from the ZAP
II bacteriophage according to the manufacturer's
instructions (Stratagene, CA, USA) and cloned into the
plasmid pBluescript. A CsCl purified preparation of the
DNA was made and this was sequenced on both strands.
Sequencing was performed using an Applied Biosystems
automated DNA sequencer, with fluorescent
dideoxynucleotide analogues according to the
manufacturer's instructions. The DNA sequence was
analysed using software supplied by Applied Biosystems.
Two clones isolated from the mouse testis cDNA library,
68-1 and 68-2, shared large regions of nucleotide
sequence identity and appeared to encode a novel member
of the haemopoietin receptor family. The inventors gave
the putative receptor the working name "NR6".
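The washes in (i) are specified as multiples of SSC, and lower salt corresponds to a more stringent wash. As a small worked example, the salt concentrations behind each strength can be derived from the definition given above (1 x SSC is 150 mM sodium chloride, 15 mM sodium citrate). This is an illustrative sketch; the function name is hypothetical.

```python
# 1 x SSC as defined in the text, in millimolar.
SSC_1X_MM = {"sodium chloride": 150.0, "sodium citrate": 15.0}


def ssc_concentrations(strength):
    """Return the mM concentration of each salt at the given SSC strength."""
    return {salt: round(mm * strength, 3) for salt, mm in SSC_1X_MM.items()}


if __name__ == "__main__":
    # Strengths that appear in the hybridisation and wash steps.
    for strength in (6, 2, 1, 0.2, 0.1):
        print(f"{strength} x SSC:", ssc_concentrations(strength))
```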

(ii) In a parallel series of experiments, a commercial
mouse brain cDNA library (STRATAGENE #967319, Balb/c
day-20, whole brain cDNA/Uni-ZAP XR Vector) was used to
infect E. coli strain XL1-Blue MRF'. Infected bacteria
were grown on 90 x 135 mm square agar plates to give about
25,000 plaques per plate. Plaques were then transferred
to positively charged nylon membranes, Hybond-N(+)
(Amersham RPN 203B), bacteria were lysed and the DNA was
denatured with 0.5 M NaOH, 1.5 M NaCl at room
temperature for 7 min. The membranes were neutralized
with 0.5 M Tris-HCl pH7.2, 1.5 M NaCl, 1 mM EDTA at room
temperature for 10 min before DNA fixation by UV
crosslinking.

A mixture of WSDWS and WSEWS oligonucleotide probes (SEQ
ID NOs: 7 and 8) was labelled with [γ-32P]-ATP
(TOYOBO #PNK-104 Kination kit). The membranes from the
mouse brain cDNA library were then hybridized with the
mixture of WSDWS and WSEWS oligonucleotide probes in the
Rapid Hybridization Buffer (Amersham, RPN1636) at 42°C
for 16 hours. Filters were washed with 1 x SSC/0.1% (w/v)
SDS at 42°C before autoradiography. Plaques that
appeared positive on orientated duplicate filters were
picked and replated on E. coli XL1-Blue MRF', with the
process of immobilisation on nylon membranes,
hybridization of membranes with oligonucleotide probes,
washing and autoradiography repeated until pure plaques
had been obtained.

The cDNA fragment from pure positively hybridizing
plaques was isolated by excision with the helper phage
strain ExAssist according to the manufacturer's
instructions (Stratagene, #967319). Sequencing was
performed after amplification with Ampli-Taq DNA
polymerase and a Taq dideoxy terminator cycle sequencing
kit (Perkin Elmer, #401150) by 25 cycles of 96°C for 10
sec, 50°C for 5 sec, 60°C for 4 min followed by 60°C for
5 min with the sequencing primers on an ABI model 377
DNA sequencer.

One clone, MBC-8, from the mouse brain library shared
large regions of nucleotide sequence identity with both
the 68-1 and 68-2 clones isolated from the mouse testis
cDNA library.

(iii) In a third series of experiments, total RNA was
prepared from the mouse osteoblastic cell line, KUSA,
according to the method of Chirgwin et al. (15), and
poly(A)+ RNA was further purified by oligo(dT)-cellulose
chromatography (Pharmacia Biotech). Complementary DNA
was synthesized by oligo(dT) priming, inserted into the
UniZAP XR directional cloning vector (Stratagene), and
packaged into λ phage using Gigapack Gold (Stratagene),
yielding 1.25 x 10^7 independent clones.

Approximately 10^6 clones were screened essentially as
described in (ii) above. Briefly, probes were labeled
with 32P using T4 polynucleotide kinase and
prehybridization was performed for 4 hr in the Rapid
hybridization buffer (Amersham LIFE SCIENCE) at 42°C.
Filters (Hybond N+, Amersham) were then hybridized for
19 hr under the same condition with the addition of
32P-labeled WSXWS mix oligonucleotides and washed 3 times.
The final wash was for 30 min in 1 x SSPE, 0.1% (w/v)
SDS at 42°C. Filters were then exposed with an
intensifying screen to Kodak X-OMAT AR film for 5 days.

Isolated clones were subjected to the in vivo excision
of pBluescript SK(-) phagemid (Stratagene), and plasmid
DNA was prepared by the standard method. DNA sequences
were determined using an ABI PRISM 377 DNA Sequencer
(Perkin Elmer) with appropriate synthetic
oligonucleotide primers. A clone, pKUSA166, shared large
regions of nucleotide sequence identity with the MBC-8,
68-1 and 68-2 clones isolated from the mouse brain and
testis cDNA libraries.
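The scale of these screens can be put in rough perspective with a short calculation. The sketch below counts the plaques plated in (i) (twenty plates at approximately 50,000 plaques each) and estimates the chance that a screen of N clones contains at least one copy of a given cDNA, using P = 1 - (1 - f)^N. The abundance f used here is a purely hypothetical value chosen for illustration; the specification does not state the abundance of NR6 cDNA in any library.

```python
import math


def plaques_screened(plates, plaques_per_plate):
    """Total plaques plated across all plates."""
    return plates * plaques_per_plate


def prob_at_least_one(clones_screened, abundance):
    """P(at least one positive clone) = 1 - (1 - f)^N, computed stably."""
    return -math.expm1(clones_screened * math.log1p(-abundance))


if __name__ == "__main__":
    testis_total = plaques_screened(20, 50_000)
    print("plaques plated in (i):", testis_total)

    # Hypothetical abundance: one NR6 cDNA per 100,000 clones (an assumption).
    assumed_abundance = 1e-5
    for n in (testis_total, 10**6):
        print(n, "clones ->", prob_at_least_one(n, assumed_abundance))
```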

EXAMPLE 3

Isolation of further NR6 cDNA clones using probes
specific for NR6

(i) In order to identify other cDNA libraries
containing cDNA clones for NR6, the inventors performed
PCR upon 1 µl aliquots of λ-bacteriophage cDNA libraries
made from mRNA from various human tissues and using
oligonucleotides 2070 and 2057, designed from the
sequence of 68-1 and 68-2, as primers. Reactions
contained 5 µl of 10 x concentrated PCR buffer
(Boehringer Mannheim GmbH, Mannheim, Germany), 1 µl of
10 mM dATP, dCTP, dGTP and dTTP, 2.5 µl of the
oligonucleotides HYB2 and either T3 or T7 at a
concentration of 100 mg/ml, 0.5 µl of Taq polymerase
(Boehringer Mannheim GmbH) and water to a final volume
of 50 µl. PCR was carried out in a Perkin-Elmer 9600 by
heating the reactions to 96°C for 2 min and then for 25
cycles at 96°C for 30 sec, 55°C for 30 sec and 72°C for
2 min. PCR products were resolved on an agarose gel,
immobilized on a nylon membrane and hybridized with
32P-labelled oligonucleotide 1943 (SEQ ID NO: 42).
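The cycling conditions in (i) can be summarised as a total programmed hold time. This is a minimal sketch that sums only the stated hold steps; it ignores ramp times between temperatures, which depend on the instrument, and the function name is illustrative.

```python
def pcr_hold_minutes(initial_denaturation_min, cycles, step_seconds):
    """Total programmed hold time in minutes, excluding ramping."""
    per_cycle_min = sum(step_seconds) / 60
    return initial_denaturation_min + cycles * per_cycle_min


if __name__ == "__main__":
    # 96°C for 2 min, then 25 cycles of 96°C 30 s, 55°C 30 s, 72°C 2 min.
    total = pcr_hold_minutes(2, 25, [30, 30, 120])
    print("programmed hold time (min):", total)
```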



In addition to the original library, a mouse brain cDNA
library appeared to contain NR6 cDNAs. These were
screened using 32P-labelled oligonucleotides 1944,
2106, 2120 (Example 1) or with a fragment of the
original NR6 cDNA clone from 68-1 (nucleotide 934 to the
end of NR6.1 in Figure 1) labelled with 32P using a
random decanucleotide labelling kit (Bresatec).
Conditions used were similar to those described in (i)
above except that for the labelled oligonucleotides,
filters were washed at 55°C rather than 45°C, while for
the NR6 cDNA fragment prehybridization and hybridization
was carried out in 2 x SSC and filters were washed in 0.2
x SSC at 65°C. Again, as described in (i) above,
positively hybridising plaques were purified, the cDNAs
were recovered and cloned into plasmids pBluescript II
or pUC19. Independent cDNA clones were sequenced on
both strands.

Using this procedure, 6 further clones, 68-5, 68-35,
68-41, 68-51, 68-77 and 73-23, were found to contain large
regions of sequence identity with 68-1, 68-2, MBC-8 and
pKUSA166.

In a parallel series of experiments, further screening
was performed with hybridization probes prepared from
the 1.7 kbp EcoRI-XhoI fragment excised from pKUSA166.
This fragment was excised and labeled with 32P by using
the T7QuickPrime Kit (Pharmacia Biotech). Approximately
6 x 10^5 clones were screened. Hybond N+ filters
(Amersham) were first prehybridized for 4 hr at 42°C in
50% (v/v) formamide, 5 x SSPE, 5 x Denhardt's solution, 0.1%
(w/v) SDS, and 0.1 mg/ml denatured salmon sperm DNA.
Hybridization was for 16 hours under the same conditions
with the addition of 32P-labelled NR6 cDNA fragment
probes. Finally the filters were washed once for 1 hr in
0.2 x SSC, 0.1% (w/v) SDS at 68°C. Eight clones were
isolated, and phage clones were subjected to the in vivo
excision of the pBluescript SK(-) phagemid (Stratagene).
The plasmid DNAs were prepared by the standard method.
DNA sequences were determined by an ABI PRISM 377 DNA
Sequencer using appropriate synthetic oligonucleotide
primers.

Using this procedure, 8 further clones from the KUSA
library that contained large regions of sequence identity
with 68-1, 68-2, MBC-8, pKUSA166, 68-5, 68-35, 68-41,
68-51, 68-77 and 73-23 were isolated.

EXAMPLE 4
Isolation of genomic DNA encoding NR6

DNA encoding the murine NR6 genomic locus was also
isolated using the 68-1 cDNA as a probe. Two positive
clones, 2-2 and 57-3, were isolated from a mouse 129/Sv
strain genomic DNA library cloned into λ FIX. These
clones were overlapping and the position of the
restriction sites, introns and exons were determined in
the conventional manner. The region of the genomic
clones containing exons and the intervening introns were
sequenced on both strands using an Applied Biosystems
automated DNA sequencer, with fluorescent
dideoxynucleotide analogues according to the
manufacturer's instructions. Figure 2 shows the
nucleotide sequence and corresponding amino acid
sequence of the translation regions. This is also shown
in SEQ ID NOs:30 and 31. Figure 3 provides the genomic
NR6 gene sequence but with additional 5' sequence. This
is also represented in SEQ ID NO: 38 in relation to this
sequence. The coding exons of NR6 span approximately
11 kb of the mouse genome. There are 9 coding exons
separated by 8 introns:

exon 1   at least 239 nt   intron 1   5195 nt
exon 2   282 nt            intron 2    214 nt
exon 3   130 nt            intron 3    107 nt
exon 4   170 nt            intron 4   1372 nt
exon 5   158 nt            intron 5     68 nt
exon 6   169 nt            intron 6   2020 nt
exon 7   188 nt            intron 7    104 nt
exon 8    43 nt            intron 8    181 nt
exon 9   252 nt

Exon 1 encodes the signal sequence, exon 2 the Ig-like
domain, exons 3 to 6 the haemopoietin domain. Exons 7, 8
and 9 are alternatively spliced.
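The exon and intron lengths listed above can be summed to check them against the stated span of approximately 11 kb. This is a simple arithmetic sketch; exon 1 is entered at its stated minimum of 239 nt, so the total is a lower bound.

```python
# Lengths in nucleotides, in genomic order, as listed in the table above.
EXON_NT = [239, 282, 130, 170, 158, 169, 188, 43, 252]  # exon 1 is "at least" 239
INTRON_NT = [5195, 214, 107, 1372, 68, 2020, 104, 181]


def locus_span(exons, introns):
    """Total span covered by the coding exons and the introns between them."""
    return sum(exons) + sum(introns)


if __name__ == "__main__":
    print("exons:", len(EXON_NT), "total nt:", sum(EXON_NT))
    print("introns:", len(INTRON_NT), "total nt:", sum(INTRON_NT))
    print("minimum span (nt):", locus_span(EXON_NT, INTRON_NT))
```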

EXAMPLE 5

5' RACE analysis of NR6

5'-RACE was used to investigate the nature of the
sequence 5' of nucleotide 960, encoding Ile321 of NR6.1,
2 and 3. The nucleotide and corresponding amino acid
sequences are shown in SEQ ID NOs:12, 14 and 16,
respectively. 5'-RACE was performed using Advantage
KlenTaq polymerase (Clontech, cat no. K1905-1) on mouse
brain Marathon-ready cDNA (Clontech, cat no. 7450-1)
according to the manufacturer's instructions. Briefly,
the first rounds of amplification were performed using
5 µl of cDNA in a total volume of 50 µl, with 1 mM each of
the primers AP1 and M116 [SEQ ID NO:2] or AP1 and M159 [SEQ ID
NO:4] by 35 cycles of 94°C x 0.5 min, 68°C x 2.0 min on a
GeneAmp 2400 (Perkin-Elmer). An amount of 5 µl of 50-fold
diluted product from the first amplification was
then re-amplified; for the products generated with
primers AP1 and M116 [SEQ ID NO:2] in the first
amplification, 1 mM of the primers AP2 and M108 [SEQ ID
NO:3] were used in the second amplification. For the
products generated with primers AP1 and M116 [SEQ ID
NO:2] in the first amplification, two separate secondary
reactions were performed, one reaction with 1 mM primers
AP2 and M242 [SEQ ID NO: 5] and the other with 1 mM primers
AP2 and M112 [SEQ ID NO: 6]. Amplification was achieved
using 25 cycles of 94°C x 0.5 min, 68°C x 2.0 min. These
samples were analyzed by agarose gel electrophoresis.
When a single ethidium bromide staining amplification
product was observed, it was purified by QIAquick PCR
purification kit according to the manufacturer's
instructions (Qiagen, cat no. DG-02 81) and its sequence was
directly determined using both primers used in the
secondary amplification step, that is AP2 and either
M108 [SEQ ID NO:3], M242 [SEQ ID NO:5] or M112 [SEQ ID
NO: 6].

EXAMPLE 6

Cloning of NR6

From the initial screens of mouse brain and testis cDNA
libraries with the degenerate WSXWS oligonucleotides and
subsequent screening of cDNA libraries from mouse
testis, mouse brain and the KUSA osteoblastic cell line,
a total of 18 NR6 cDNAs have been isolated. Nucleotide
sequence of NR6 was also determined from 5' RACE analysis
of brain cDNA. Additionally, two murine genomic DNA
clones encoding NR6 have also been isolated.

Comparison of the NR6 cDNA clones revealed a common
region of nucleotide sequence which included a 123 base
pair 5'-untranslated region and a 1221 base pair open
reading frame, stretching from the putative initiation
methionine, Met1, to Gln407 (SEQ ID NOs:12, 14 and 16,
respectively). Within this common open reading frame, a
haemopoietin receptor domain was observed which
contained the four conserved cysteine residues and the
five amino acid motif WSXWS typical of members of the
haemopoietin receptor family.

Further analyses revealed that after nucleotide 1221,
three different classes of NR6 cDNAs could be found;
these were termed NR6.1, NR6.2 and NR6.3 (SEQ ID NOs:12,
14 and 16, respectively). Each encoded a receptor that
appeared to lack a classical transmembrane domain and
would, therefore, be likely to be secreted into the
extracellular environment. Although the putative
C-terminal regions of the three classes of NR6 proteins
appear to be different, the cDNAs encoding them also had
a common region of 3'-untranslated sequence.

With regard to SEQ ID NOs:12, 14 and 16, the numbering of
both nucleotides and amino acids begins at the putative
initiation methionine. NR6.1 and NR6.2 are identical to
nucleotide 1223 encoding Q407; this represents the end
of an exon. NR6.1 splices out an exon present only in
NR6.2 and uses a different reading frame for the final
exon which is shared with NR6.2. The 3'-untranslated
region is shared by NR6.1, NR6.2 and NR6.3. NR6.2
splices in an exon starting with nucleotide 1224
encoding D408 and ending with nucleotide 1264 encoding
the first nucleotide in the codon for G422 and uses a
different reading frame for the final exon which is
shared with NR6.2 (see Figure 1). NR6.3 fails to splice
from position nucleotide 1224; therefore, translation
continues through the intron, giving rise to the
C-terminal protein region.
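One way to see why including or skipping the NR6.2-specific exon changes the reading frame of the final exon is to look at the exon length modulo three. The sketch below rests on two assumptions that are not stated in this form in the text: that the NR6.2-specific exon is the 43 nt exon 8 of Example 4, and that it directly follows the 1221 nt common open reading frame (the text itself quotes nucleotides 1223 and 1224 for this boundary). The names used are illustrative.

```python
COMMON_ORF_END = 1221  # assumed: last nucleotide of the Gln407 codon (3 x 407)
EXON8_NT = 43          # assumed: the NR6.2-specific exon, from the Example 4 table


def codon_position(nucleotide):
    """Map a 1-based coding nucleotide to (codon number, position 1-3)."""
    index = nucleotide - 1
    return index // 3 + 1, index % 3 + 1


def frame_shift(inserted_nt):
    """Shift in reading frame of downstream sequence after an insertion."""
    return inserted_nt % 3


if __name__ == "__main__":
    exon_end = COMMON_ORF_END + EXON8_NT
    print("exon would end at nucleotide:", exon_end)
    print("codon and position of that nucleotide:", codon_position(exon_end))
    print("frame shift for the final exon:", frame_shift(EXON8_NT))
```

Under these assumptions the exon ends at nucleotide 1264, the first position of codon 422, which agrees with the text, and the final exon is read one position out of register relative to the transcript that skips the exon.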

The sequence of NR6 cDNA products generated by 5'-RACE
amplification from a mouse brain cDNA preparation is
shown in SEQ ID NO: 18. The nucleotide sequence
identified using 5'-RACE appeared to be identical to the
sequence of cDNAs encoding NR6.1, NR6.2 and NR6.3 from
nucleotide C151, the first nucleotide of the codon for
Pro51. 5' of this nucleotide, the sequences diverged
and the sequence is unique, not being found in NR6.1,
NR6.2 or NR6.3. Additionally, there is a single
nucleotide difference, with the sequence from the RACE
containing a G rather than an A at nucleotide 475,
resulting in Thr159 becoming Ala.
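The coordinates quoted in this paragraph are internally consistent, which can be confirmed with a short check. The sketch assumes numbering starts at the A of the initiation codon, as stated earlier in this Example, and uses the standard genetic code (threonine is ACN, alanine is GCN). The function name is illustrative.

```python
def codon_position(nucleotide):
    """Map a 1-based coding nucleotide to (codon number, position 1-3)."""
    index = nucleotide - 1
    return index // 3 + 1, index % 3 + 1


# Standard genetic code: all four threonine and alanine codons.
THR_CODONS = {"ACT", "ACC", "ACA", "ACG"}
ALA_CODONS = {"GCT", "GCC", "GCA", "GCG"}


def first_base_a_to_g(codon):
    """Apply an A to G change at the first codon position."""
    if codon[0] != "A":
        raise ValueError("codon does not start with A")
    return "G" + codon[1:]


if __name__ == "__main__":
    print("nucleotide 151 ->", codon_position(151))
    print("nucleotide 475 ->", codon_position(475))
    print({c: first_base_a_to_g(c) for c in sorted(THR_CODONS)})
```

Nucleotide 475 falls at the first position of codon 159, and an A to G change at the first position of any threonine codon yields an alanine codon, in line with the Thr159 to Ala difference described.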

Analysis of the genomic clones revealed that they were
overlapping and contained exons encoding the majority of
the coding region of the three forms of NR6 (Figures 1,
2 and 3). These genomic clones contained exons
encoding from Asp50 (nucleotide 148) of the NR6 cDNAs.
Sequence 5' of this in the cDNAs, including the
5'-untranslated region and the region encoding Met1 to
Gln49 (SEQ ID NOs:12, 14 and 16), and the 5' end
predicted from analysis of 5' RACE products (SEQ ID
NO: 18) were not present in the two genomic clones
isolated.



Analysis of the NR6 genomic DNA clones also provided an
explanation of the three classes of NR6 cDNAs found. It
is likely that NR6.1, NR6.2 and NR6.3 arise through
alternative splicing of NR6 mRNA (Figure 1). The last
amino acid residue that these different NR6 proteins are
predicted to share is Gln407. SEQ ID NO: 18 shows that
Gln407 is the last amino acid encoded by the exon that
covers nucleotides g5850 to g6037 (see Figure 2).
Alternative splicing from the end of this exon (Figure
1) accounts for the generation of cDNAs encoding NR6.1
(SEQ ID NO:12), NR6.2 (SEQ ID NO:14) and NR6.3 (SEQ ID
NO:16). In the case of NR6.1, the region from g6038 to
g6425 is spliced out, leading to juxtaposition of g6037
and g6426. In the case of NR6.2, the region from g6038
to g6141 is spliced out, an exon from g6142 to g6183 is
retained and then this is followed by splicing out of
the region from g6183 to g6425. NR6.3 appears to arise
when there is no splicing from nucleotide g6038. For
all three forms, a secreted rather than transmembrane
form is generated; these differ, however, in their
predicted C-terminal region. The genomic NR6 sequence
with additional 5' sequence is shown in Figure 3.



EXAMPLE 7
ESTs

Databases were searched with the murine NR6
corresponding to the unspliced version shown in SEQ ID
NO: 16. The murine NR6 sequence used is shown in SEQ ID
NO: 22.

The databases searched were:

(i) dbEST - Database of Expressed Sequence Tags,
National Center for Biotechnology Information, National
Library of Medicine, 38A, 8N805, 8600 Rockville Pike,
Bethesda, MD 20894, USA. Phone: 0011-1-301-496-2475 Fax:
0015-1-301-480-9241.

(ii) DNA Data Bank of Japan DNA Database Release 3 689.
Prepared by: Sanzo Miyazawa, Manager/Database
Administrator; Hidenori Hayashida, Scientific Reviewer;
Yukiko Yamazaki/Eriko Hatada/Hiroaki Serizawa,
Annotators/reviewers; Motono Horie/Shigeko Suzuki/Yumiko
Satao, Secretaries/typists. DNA Data Bank of Japan,
National Institute of Genetics, Center for Genetic
Information Research, Laboratory of Genetic Information
Analyses, 1111 Yata, Mishima, Shizuoka 411, Japan.

(iii) EMBL Nucleic Acid Sequence Data Bank Release
47.0.

(iv) EMBL Nucleic Acid Sequence Data Bank Weekly Updates
Since Release 44.

(v) Genetic Sequence Data Bank NCBI-GenBank Release 94,
National Center for Biotechnology Information, National
Library of Medicine, 38A, 8N805, 8600 Rockville Pike,
Bethesda, MD 20894, USA. Phone: 0011-1-301-495-2475 Fax:
0015-1-301-480-9241.

(vi) Cumulative Updates since NCBI-GenBank Release 88,
National Center for Biotechnology Information, National
Library of Medicine, 38A, 8N805, 8600 Rockville Pike,
Bethesda, MD 20894, USA.

The search of the databases with the murine probe
identified several ESTs having sequence similarity to
the probe. The ESTs were:

W66776 (murine sequence)
MM583 9 (murine sequence)
AA014965 (murine sequence)
W46604 (human sequence)
W46603 (human sequence)
H14009 (human sequence)
N78873 (human sequence)
R87407 (human sequence).
EXAMPLE 8

Isolation of 3' cDNA clones encoding human NR6

PCR products encoding human NR6 were generated using
oligonucleotides UP1 and LP1 (see below) based on human
ESTs (Genbank Acc:H14009, Genbank Acc:AA042914) that
were identified from databases searched with murine NR6
sequence (SEQ ID NO: 22). PCR was performed on a human
fetal liver cDNA library (Marathon ready cDNA CLONTECH
#7403-1) using Advantage Klen Taq Polymerase mix
(CLONTECH #8417-1) in the buffer supplied at 94°C for
30 s and 68°C for 3 min for 35 cycles followed by 68°C
for 4 min and then stopping at 15°C. A standard PCR
programme for the Perkin-Elmer GeneAmp PCR System 2400
thermal cycler was used. The PCR yielded a prominent
product of approximately 560 base pairs (bp; SEQ ID
NO:18), which was radiolabeled with [α-32P]dCTP using a
random priming method (Amersham, RPN 1607, Megaprime
kit) and used to screen a human fetal kidney 5'-STRETCH
PLUS cDNA library (CLONTECH #HL1150x). Library screens
were performed using Rapid Hybridisation Buffer
(Amersham, RPN 1636) according to the manufacturer's
instructions and membranes were washed at 65°C for 30 min in
0.1 x SSC/0.1% (w/v) SDS. Two independent cDNA clones
were obtained as lambda phage and subsequently subcloned
and sequenced. Both clones (HFK-63 and HFK-66)
contained 1.4 kilobase (kb) inserts that showed sequence
similarity with murine NR6. The sequence and
corresponding amino acid translation of HFK-66 is shown
in SEQ ID NO: 24.

The translated protein sequence of clone HFK-66 shows
a high degree of sequence similarity with the mouse NR6.

OLIGONUCLEOTIDES

UP1: 5' TCC AGG CAG CGG TCG GGG GAC AAC 3' [SEQ ID NO:26]
LP1: 5' TTG CTC ACA TCG TCC ACC ACC TTC 3' [SEQ ID NO:27]
EXAMPLE 9
Genomic Structure of Human NR6

Human genomic DNA clones encoding human NR6 were
isolated by screening a human genomic library (Lambda
FIX II, Stratagene 946203) with radiolabeled
oligonucleotides, 2199 and 2200 (see below). These
oligonucleotides were designed based on human ESTs
(Genbank Acc:R87407, Genbank Acc:H14009) that were
identified from databases searched with murine NR6.
Filters were hybridised overnight at 37°C in 6 x SSC
containing 2 mg/ml bovine serum albumin, 2 mg/ml Ficoll,
2 mg/ml polyvinylpyrrolidone, 100 mM ATP, 10 mg/ml tRNA,
2 mM sodium pyrophosphate, 2 mg/ml salmon sperm DNA,
0.1% (w/v) SDS and 200 mg/ml sodium azide and washed at
65°C in 6 x SSC/0.1% SDS. Five independent genomic
clones were obtained and sequenced. The extent of
sequence obtained has shown that the clones overlap
and exhibit a similar genomic structure to murine NR6.
Exon coding regions are almost identical over the region
covered by the genomic clones while intron
regions differ, although the sizes of the introns are
comparable. The extent of known overlap is shown in
Fig. 5.

OLIGONUCLEOTIDES:

2199: 5' CCC ACG CTT CTC ATC GGA TTC TCC CTG 3' [SEQ ID
NO:36]

2200: 5' CAG TCC ACA CTG TCC TCC ACT CGG TAG 3' [SEQ ID
NO:37]

EXAMPLE 10

Northern Blot Analysis of Human NR6 mRNA Expression

Clontech Multiple Tissue Northern Blots (Human MTN Blot,
CLONTECH #7760-1, Human MTN Blot IV, CLONTECH #7766-1,
Human Brain MTN Blot II, CLONTECH #7755-1, Human Brain
MTN Blot III, CLONTECH #7750) were probed with a
radiolabeled 3' human NR6 cDNA clone, HFK-66 (SEQ ID
NO: 24). The clone was labelled with [α-32P]dCTP using a
random priming method (Amersham, RPN 1607, Megaprime
kit). Hybridisation was performed in Express
Hybridisation Solution (CLONTECH H50910) for 3 hours at
67°C and membranes were washed in 0.1 x SSC/0.1% (w/v) SDS
at 50°C.

A 1.8 kb transcript was detected in a variety of human
tissues encompassing reproductive, digestive and neural
tissues. High levels were observed in the heart,
placenta, skeletal muscle, prostate and various areas of
the brain; lower levels were observed in the testis,
uterus, small intestine and colon. Photographs showing
these Northern blots are available upon request. This
expression pattern differs from the expression pattern
observed with murine NR6.

EXAMPLE 11
Mouse NR6 Expression Vectors



pEF-FLAG/mNR6.1

The mature coding region of mouse NR6.1 was amplified
using the PCR to introduce an in-frame Asc I restriction
enzyme site at the 5' end of the mature coding region
and an Mlu I site at the 3' end, using the following
oligonucleotides:

5' oligo 5'-AGCTGGCGCGCCTCCCGGGCGGATCGGGAGCCCAC-3' [SEQ
ID NO:30]

3' oligo 5'-AGCTACGCGTTTAGAGTTTAGCCGGCAG-3' [SEQ ID
NO:31]

The resulting PCR derived DNA fragment was then digested
with Asc I and Mlu I and cloned into the Mlu I site of
pEF-FLAG. Expression of NR6 is under the control of the
polypeptide chain elongation factor 1α promoter as
described (16) and results in the secretion, using the
IL3 signal sequence from pEF-FLAG, of N-terminal
FLAG-tagged NR6 protein.



pEF-FLAG was generated by modifying the expression
vector pEF-BOS as follows:

pEF-BOS (16) was digested with Xba I and a linker was
synthesized that encoded the mouse IL3 signal sequence
(MVLASSTTSIHTMLLLLLMLFHLGLQASIS) and the FLAG epitope
(DYKDDDDK). Asc I and Mlu I restriction enzyme sites
were also introduced as cloning sites. The sequence of
the linker is as follows, shown in three consecutive
segments with the upper strand written 5' to 3' and the
lower strand beneath it:

Segment 1, encoding MVLASSTTSIHTM:
CTAGACTAGTGCTGACACAATGGTTCTTGCCAGCTCTACCACCAGCATCCACACCATG
    TGATCACGACTGTGTTACCAAGAACGGTCGAGATGGTGGTCGTAGGTGTGGTAC

Segment 2, encoding LLLLLMLFHLGLQASIS, followed by the Asc I site:
CTGCTCCTGCTCCTGATGCTCTTCCACCTGGGACTCCAAGCTTCAATCTCGGCGCGCC
GACGAGGACGAGGACTAGCAGAAGGTGGACCCTGAGGTTCGAAGTTAGAGCCGCGCGG

Segment 3, encoding DYKDDDDK, followed by the Mlu I site:
AGGACTACAAGGACGACGATGACAAGACGCGTGCTAGCACTAGT
TCCTGATGTTCCTGCTGCTACTGTTCTGCGCACGATCGTGATCAGATC

The two oligonucleotides were annealed together and
ligated into the Xba I site of pEF-BOS to give pEF-FLAG.
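The linker design can be checked by translating its upper strand from the first ATG. The following is a minimal sketch using the standard genetic code; the recognition sequences used for Asc I (GGCGCGCC) and Mlu I (ACGCGT) are the standard ones for those enzymes, and the variable and function names are illustrative only.

```python
# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Upper strand of the linker, joined from the three segments given above.
LINKER_TOP = (
    "CTAGACTAGTGCTGACACAATGGTTCTTGCCAGCTCTACCACCAGCATCCACACCATG"
    "CTGCTCCTGCTCCTGATGCTCTTCCACCTGGGACTCCAAGCTTCAATCTCGGCGCGCC"
    "AGGACTACAAGGACGACGATGACAAGACGCGTGCTAGCACTAGT"
)

IL3_SIGNAL = "MVLASSTTSIHTMLLLLLMLFHLGLQASIS"
FLAG = "DYKDDDDK"
ASC_I_SITE = "GGCGCGCC"
MLU_I_SITE = "ACGCGT"


def translate_from_first_atg(seq):
    """Translate from the first ATG, stopping at a stop codon or sequence end."""
    start = seq.find("ATG")
    if start < 0:
        raise ValueError("no ATG found")
    protein = []
    for i in range(start, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    protein = translate_from_first_atg(LINKER_TOP)
    print(protein)
    print("signal sequence present:", protein.startswith(IL3_SIGNAL))
    print("FLAG epitope in frame:", FLAG in protein)
    print("Asc I site present:", ASC_I_SITE in LINKER_TOP)
    print("Mlu I site present:", MLU_I_SITE in LINKER_TOP)
```

Because the Asc I site lies between the signal sequence and the FLAG epitope and the Mlu I site follows the epitope, an insert cloned at the Mlu I site would be expected to sit downstream of both elements.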

pCOS1/FLAG/mNR6 & pCHO1/FLAG/mNR6

A DNA fragment containing the sequences encoding IL3
signal sequence/Flag/mNR6 and the poly(A) adenylation
signal from human G-CSF cDNA was excised from
pEF-FLAG/mNR6 using the restriction enzyme EcoR I. This DNA
fragment was then inserted into the EcoR I cloning site
of pCOS1 and pCHO1.

The pCOS1 and pCHO1 vectors were constructed as follows.
pCHO1 is also described in reference (17) but with a
different selectable marker.

pCOS1 was prepared by digesting HEF-12h-gγ1 (see Figure
24 of International Patent Publication No. WO 92/19759)
with EcoRI and SmaI and ligating the digestion product
with an EcoRI-NotI-BamHI adaptor (Takara 4510). The
resulting plasmid comprises an EF1α promoter/enhancer,
a Neo r marker gene, SV40E, ori and an Amp r marker gene.
pCHO1 was constructed by digesting DHFR-PMh-gγ1 (see
Figure 25 of International Patent Publication No. WO
92/19759) with PvuI and Eco47III and ligating same with
pCOS1 digested with PvuI and Eco47III. The resulting
vector, pCHO1, comprises an EF1α promoter/enhancer, a
DHFR marker gene, SV40E, ori and an Amp r gene.



EXAMPLE 12

mNR6 has been expressed as an N' Flag tagged protein
following transfection of CHO cells and as a C' Flag
tagged protein following transfection of KUSA cells. In
both cases varying levels of dimeric and aggregated NR6
were secreted.



EXAMPLE 13
Murine NR6 expression

NR6 expression studies were conducted in murine Northern
blots. At the level of sensitivity used, NR6 expression
in the adult mouse was detected in salivary gland,
lung and testis. During embryonic development, NR6 is
expressed in fetal tissues from day 10 of gestation
through to birth. In cell lines, NR6 expression has
been observed in the T-lymphoid line CTLL-2 as well as
in FD-PyMT (FDC-P1 myeloid cells expressing polyoma
middle T gene), and fibroblastoid cells including bone
marrow and fetal liver stromal lines.

EXAMPLE 14

Expression, purification and characterisation of CHO and
KUSA mNR6

The methods provide for the production of a dimeric form 
of CHO derived NN FLAG-mNR6 without refolding. All 

- 56 - 



SUBSTITUTE SHEET (RULE 26) 



WO 98/11225 



PCT/GB97/02479 



other methods are capable of producing NR6 and are 
encompassed by the present invention. 

A. Production of CHO derived N′ FLAG-mNR6 (dimeric form)

(i) Protein Production 

To analyse structure and functional activity, a cDNA
fragment containing the entire coding sequence of murine
NR6 with an N-terminal FLAG (N′ FLAG) sequence was
cloned into the EcoRI site of the expression vector
pCHO1. For stable production of N-terminal FLAG-tagged
NR6, the vector contains the DHFR (dihydrofolate
reductase) gene as a selective marker, with the NR6 gene
under the control of an EF1α promoter. CHO cells were
transfected with the construct using a polycationic
liposome transfection reagent (Lipofectamine, GibcoBRL).

(ii) Lipofectamine transfection method 

Using six well tissue culture plates, either 2 x 10^5 KUSA
cells in 2 ml IMDM + 10% (v/v) FCS or 2 x 10^5 CHO cells
in 2 ml α-MEM + 10% (v/v) FCS were cultured until 70%
confluent. 2 μg DNA diluted in 100 μl OPTI-MEM I (Gibco
BRL, USA) was mixed gently with 12 μl Lipofectamine
diluted in 100 μl OPTI-MEM I and incubated at room
temperature for 30 min to allow DNA complex formation.
DNA complexes were gently diluted to a total volume of
1 ml with OPTI-MEM I and overlaid onto washed KUSA or CHO
cell monolayers. A further 1 ml IMDM + 20% (v/v) FCS
(KUSA cells) or 1 ml α-MEM + 20% (v/v) FCS (CHO cells)
was added to transfected cells after 5 hours. At 24
hours, the culture medium was replaced with fresh
complete growth medium. At 48 hours after transfection,
selection was applied. A methotrexate resistant clone
secreting comparatively high levels of NR6 was selected
and expanded for further analysis.

(iii) Protein expression

CHO cells were grown to confluence in roller bottles in
nucleoside free α-MEM + 10% (v/v) FCS. Selection was
maintained by using 100 ng/ml methotrexate in the
conditioned media according to the manufacturer's
instructions. Expression was monitored by Biosensor and
harvesting was found to be optimal at 3 to 4 days.

B. Protein Analysis

(i) Biosensor analysis

Expression and purification were monitored by Biosensor
analysis (BIAcore™, Sweden), in which anti-FLAG peptide M2
antibody (Kodak Eastman, USA), specific for the FLAG
peptide sequence, was bound to the sensorchip. Fractions
were analysed for binding to the sensor surface
(resonance units) and the sample was then removed from the
surface using 50 mM diethylamine pH 12.0 prior to
analysis of the next fraction. Immobilisation and
running conditions of the Biosensor follow the
manufacturer's instructions.

(ii) Protein Production

In order to generate and characterise NR6, conditioned
media (2 L) produced by CHO cells was harvested after
day 3 post confluence. Conditioned media was
concentrated by diafiltration with a 10,000 molecular
weight cut-off (Easy flow, Sartorius, Aus). At a volume
of 200 ml (i.e. 10 x concentrated) the sample was buffer
exchanged into 20 mM Tris, 0.15 M NaCl, 0.02% (v/v) Tween
20, pH 7.5 (Buffer A).

(iii) Immunoprecipitation and Western Blot analysis
of mNR6

Concentrated conditioned media (1 ml) was
immunoprecipitated with M2 affinity resin (20 μl, Kodak
Eastman). To characterise the structure of mNR6,
SDS PAGE was performed under reducing and non-reducing
conditions. Separation was performed on NOVEX
4-20% (v/v) Tris/glycine gradient gels and protein was
transferred onto PVDF membrane. Western blots were probed
with biotinylated M2 antibody (primary, 1:500) and then
streptavidin peroxidase (secondary, 1:3000). Samples
were visualised by autoradiography using
electrochemiluminescence (ECL, Dupont, USA).

By regression analysis of prestained standards
(BIORAD, Aus.) the molecular weight of the monomeric
unit was calculated to be 65,000 daltons. Under
non-reducing conditions the molecular weight was calculated
to be 127,000 daltons, indicating that NR6 is a disulphide
linked dimer. A tetrameric complex running at approximately
250,000 daltons was also observed. Although a band
running at approximately 50,000 daltons was observed, no
monomeric NR6 was detected under non-reducing conditions,
indicating that the majority of NR6 expressed in this
system is disulphide linked.
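
The regression step used to size the bands can be illustrated with a short sketch. It fits log10(molecular weight) against relative migration (Rf) for a set of standards by least squares and interpolates an unknown band. The marker sizes and Rf values below are hypothetical placeholders chosen for illustration only; they are not data from this specification, and the log-linear model is an assumed (though commonly used) calibration.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mw(standards, rf_unknown):
    """Estimate a molecular weight (daltons) from relative migration.

    standards: list of (Rf, molecular weight in daltons) pairs.
    Assumes log10(MW) is linear in Rf over the range of the standards.
    """
    rfs = [rf for rf, _ in standards]
    log_mws = [math.log10(mw) for _, mw in standards]
    slope, intercept = fit_line(rfs, log_mws)
    return 10 ** (slope * rf_unknown + intercept)


# Hypothetical (Rf, daltons) pairs; NOT taken from the specification.
HYPOTHETICAL_STANDARDS = [
    (0.10, 200000),
    (0.25, 116000),
    (0.40, 66000),
    (0.60, 31000),
    (0.80, 14000),
]

if __name__ == "__main__":
    print(round(estimate_mw(HYPOTHETICAL_STANDARDS, 0.40)))
```

In practice the same calibration would be run separately for the reducing and non-reducing gels, since each gel carries its own lane of standards.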

(iv) Affinity Chromatography of mNR6

Concentrated conditioned media (200 ml) was applied to
M2 affinity resin (5 ml) under gravity. To enhance
recovery, the unbound fraction was reapplied to the
column four times prior to extensive washing of the
column with 200 volumes of Buffer A. Biosensor analysis
indicates that approximately 20% of the M2 binding
originally present in the concentrate remains in the
unbound fraction. The bound fraction was eluted from the
column using an immunodesorbent (50 ml), Actisep
(Sterogene Labs, USA).

(v) Ion exchange and Desalting of mNR6

In order to buffer exchange mNR6 prior to anion exchange
chromatography, 10 ml batches of the eluted fraction (50
ml) were applied to an XK column (400 x 26 mm I.D.)
containing G25 sepharose (Pharmacia, Sweden).
Chromatography was developed at 4 ml/min using an FPLC
(Pharmacia, Sweden) equipped with an online UV280 and
conductivity monitor. The mobile phase was 10 mM Tris,
0.1 M NaCl, 0.02% (v/v) Tween, pH 8.0. 10 ml fractions were
collected between 12.5 min and 25 min to optimise
recovery and removal of salt. Fractions were analysed by
Biosensor analysis and pooled according to binding.

All pooled active fractions were diluted with an equal
volume of 20 mM Tris, 0.02% (v/v) Tween, pH 8.5 (Buffer
B) and then loaded onto a Mono Q 5/5 (Pharmacia, Sweden)
at a flow rate of 2 ml/min. The column was washed with
Buffer B. Elution was performed using a linear gradient
between Buffer B and Buffer B containing 0.6 M NaCl over
30 min at a flow rate of 1 ml/min. Fractions (1 minute)
were collected and analysed on the Biosensor and also by
SDS PAGE and Western blot analysis. Fractions 15 to 26
(approximately 0.4 M NaCl) appear to contain the majority
of mNR6 as indicated by the Biosensor.
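
The relationship between fraction number and salt concentration in the linear gradient can be sketched as follows. The calculation assumes that fraction 1 starts when the gradient starts and that system dead volume is negligible; neither assumption is stated in the text, and the function name is illustrative only.

```python
def nacl_at_fraction(fraction, start_molar=0.0, end_molar=0.6,
                     gradient_min=30.0, fraction_min=1.0):
    """NaCl molarity at the midpoint of a numbered fraction.

    Assumes (hypothetically) that fraction 1 begins at gradient start
    and that dead volume between mixer and collector is negligible.
    """
    midpoint_min = (fraction - 0.5) * fraction_min
    # Clamp to the gradient window so late fractions report the end value.
    midpoint_min = min(max(midpoint_min, 0.0), gradient_min)
    return start_molar + (end_molar - start_molar) * midpoint_min / gradient_min


if __name__ == "__main__":
    for fraction in (15, 26):
        print(fraction, round(nacl_at_fraction(fraction), 2))
```

Under these assumptions fractions 15 and 26 bracket roughly 0.3 M to 0.5 M NaCl, which is consistent with the value of approximately 0.4 M quoted above.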

C. Production of CHO derived N′ FLAG-mNR6 (monomeric
form)

(i) Protein Production

A cDNA fragment containing the entire coding sequence of
murine NR6 with an N-terminal FLAG sequence was cloned
into the expression vector pCHO1 for production of
N-terminal FLAG-tagged protein. This vector contains a
neomycin resistance gene with expression of the NR6 gene
under the control of an EF1α promoter. This expression
construct was transfected into CHO cells using
Lipofectamine (Gibco BRL, USA) according to the
manufacturer's instructions. Transfected cells were
cultured in IMDM + 10% (v/v) FCS with resistant cells
selected in geneticin (600 μg/ml, Gibco BRL, USA). A
neomycin resistant clone secreting comparatively high
levels of NR6 was selected and expanded for further
analysis.

(ii) Protein expression

N′ FLAG-NR6 expressed in serum free conditioned media
(10 litre) was harvested from transfected CHO cells.
Collected media was concentrated using a CH2
ultrafiltration system equipped with a S1Y10 cartridge
(Amicon, molecular weight cut-off 10,000). Preliminary
examination of the expressed product was performed by
reducing and non-reducing SDS PAGE followed by Western
blot analysis. Visualisation of the protein on Westerns
was specific to the primary antibody, anti-FLAG M2. Under
reducing conditions a band at approximately 65,000
daltons was observed. Under non-reducing conditions,
dimer and larger molecular weight aggregates were
observed. These are disulphide linked monomers as they
are not present in the reducing gel. Small amounts of
monomer appear to be present in non-reducing gels.

(iii) Affinity Chromatography of NR6

Concentrated conditioned media was applied to an anti-FLAG
M2 affinity resin (100 x 16 mm I.D.). After washing
the unbound proteins off the column, the bound proteins
were eluted using FLAG peptide (60 μg/ml) in PBS.

(iv) Ion Exchange Chromatography of NR6

Eluted fractions from the affinity column were dialysed
overnight against 20 mM Tris-HCl pH 8.5 (buffer C)
containing 50 mM dithiothreitol (DTT) using 25,000
cut-off dialysis tubing (Spectra/Por7, Spectrum). The
dialysed fractions were loaded onto Mono Q 5/5
(Pharmacia, Sweden) previously equilibrated with buffer
C containing 5 mM DTT. Chromatography was developed
using a linear gradient between buffer C and buffer C
containing 1.0 M NaCl at a flow rate of 0.5 ml/min.

(v) Refolding of NR6

Fractions containing NR6 from the Mono Q were adjusted
to 50 mM DTT and left overnight at 4°C. To initiate
refolding, the sample was then dialysed against 50 mM
Tris-HCl (pH 8.5), 2 M urea, 0.1% (v/v) Tween 20, 10 mM
glutathione (reduced) and 2 mM glutathione (oxidised) at
a final protein concentration of 100 μg/ml. Folding
was carried out at ambient temperature with one change
of the buffer over 24 hours.

(vi) Reversed Phase High Performance Liquid
Chromatography (RP-HPLC)

The folded product was further purified by RP-HPLC using
a Vydac C4 resin (250 x 4.6 mm I.D.) previously
equilibrated with 0.1% (v/v) trifluoroacetic acid (TFA).
Elution was carried out using a linear gradient from 0
to 80% (v/v) acetonitrile / 0.1% (v/v) TFA at a flow
rate of 1 ml per minute.

D. pCHO1/NR6/FLAG

In order to determine the native N terminus of NR6, a
C-terminal FLAG NR6 CHO cell line was established.

The plasmid pKUSA166 (murine NR6 cDNA cloned into the
EcoRI site of pBLUESCRIPT) was digested with BamHI to
remove the sequences encoding the last 15 amino acids of
murine NR6. Synthetic oligonucleotides which encode the
3′ end of mouse NR6 followed by the FLAG peptide tag
were annealed and ligated into the BamHI site of
pKUSA166. The sequence of the oligonucleotides was as
follows:

I L P S G R R G A A R G P A G D Y K D D D D K *  [SEQ ID NO: 34]

GATCTTGCCCTCGGGCAGACGGGGTGCGGCGAGAGGTCCTGCCGGCGACTACAAGGACGACGATGACAAGTAG  [SEQ ID NO: 33]

AACGGGAGCCCGTCTGCCCCACGCCGCTCTCCAGGACGGCCGCTGATGTTCCTGCTGCTACTGTTCATCCTAG  [SEQ ID NO: 35]

The 5′ end of the linker introduces a silent mutation
(CTG > TTG) to destroy the 5′ BamHI site upon
insertion of the linker. The NR6 cDNA (with native
signal sequence) with the C-terminal FLAG was cut out of
pKUSA166 with EcoRI and BamHI and cloned into the
EcoRI - BamHI cloning sites of pCHO1. This vector results
in the secretion of NR6 protein with a C-terminal FLAG
tag (C′ FLAG-mNR6).
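
The design of the linker can be checked with a short sketch that translates the top-strand oligonucleotide [SEQ ID NO: 33] and compares the bottom strand [SEQ ID NO: 35] with its complement. The reading frame offset of 1 is an assumption of this sketch, inferred from SEQ ID NO: 14, where the BamHI site spans the codons GGG ATC C; the bottom strand is assumed to be written 3′ to 5′ with a CTAG overhang. The helper names are illustrative.

```python
# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Top strand of the linker, 5' to 3' [SEQ ID NO: 33].
LINKER_TOP = ("GATCTTGCCCTCGGGCAGACGGGGTGCGGCGAGAGGTCCTGCCGGCGACTACAAGG"
              "ACGACGATGACAAGTAG")
# Bottom strand as printed [SEQ ID NO: 35], assumed to read 3' to 5'.
LINKER_BOTTOM_3_TO_5 = ("AACGGGAGCCCGTCTGCCCCACGCCGCTCTCCAGGACGGCCGCTGATGTTCCTGCT"
                        "GCTACTGTTCATCCTAG")

# Assumed frame: the leading G completes the upstream Gly codon (GGG).
FRAME_OFFSET = 1


def translate(dna):
    """Translate complete codons of a DNA string; '*' marks a stop codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def complement(dna):
    """Base-by-base complement, without reversing the strand."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))


if __name__ == "__main__":
    print(translate(LINKER_TOP[FRAME_OFFSET:]))
    # CTG and TTG both encode leucine, so the CTG > TTG change is silent.
    print(CODON_TABLE["CTG"], CODON_TABLE["TTG"])
    # The bottom strand pairs with the top strand downstream of its GATC overhang.
    print(complement(LINKER_TOP[4:]) + "CTAG" == LINKER_BOTTOM_3_TO_5)
```

Read in this frame, the top strand encodes the peptide given as SEQ ID NO: 34, ending in the FLAG sequence DYKDDDDK followed by a stop codon.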

This vector results in the secretion of NR6 protein from
KUSA cells. The vector pCHO1 has been previously
described in (17) although with a different selectable
marker.

(i) Production of polyclonal NR6 antiserum

The following peptide from the N-terminal region of NR6
was chosen for production of polyclonal antiserum to NR6:

VISPQDPTLLIGSSLQATCSIHGDTP [SEQ ID NO: 39]

The peptide was conjugated to KLH and injected into
rabbits. Production and purification of the polyclonal
antibody specific to the NR6 peptide sequence follows
standard methods.

(ii) Protein expression

KUSA cells transfected with cDNA encoding C-terminal tagged
mNR6 were grown to confluence in flasks (800 ml) using
IMDM media containing 10% (v/v) FBS. Conditioned media
(100 ml) was harvested 3-4 days post confluence.

(iii) Characterisation of NR6 by Immunoprecipitation
and Western blotting

In order to establish that NR6 with the predicted
sequence is produced in KUSA cells transfected with the
cDNA, Western blot analysis using both M2 antibody and
purified NR6-specific rabbit antibody was performed.
Conditioned media (1 to 5 ml) was immunoprecipitated
with M2 affinity resin (10-20 μl). After sufficient
time for binding, the beads were washed with MT-PBS and
NR6 was subsequently eluted with 100 μg/ml FLAG peptide
(40 μl; 1 x 5 minute incubation). The sample was then
subjected to reducing and non-reducing SDS PAGE followed
by Western blot analysis. Both purified NR6 polyclonal
antibody (purified by protein G) and M2 antibody
recognise a band of approximately 65,000 daltons under
reducing conditions. Since the two antibodies recognise
residues at the N terminus and C terminus respectively,
it is reasonable to assume that full length NR6 is
produced. Biotinylation of the respective antibodies by
standard methods reduces the background. Under
non-reducing conditions, polyclonal NR6 antibodies bind to
a band of a molecular weight of approximately 127,000,
consistent with a disulphide linked dimeric form of NR6.
Minor components of tetrameric NR6 are present; no
monomeric NR6 is evident using polyclonal NR6 antibodies.

EXAMPLE 15
Generation of NR6 knockout mice

To construct the NR6 targeting vector, 4.1 kb of genomic
NR6 DNA containing exons 2 through to 6 was deleted and
replaced with a G418-resistance cassette, leaving 5′ and
3′ NR6 arms of 2.9 and 4.5 kb respectively. A 4.5 kb
XhoI fragment of the murine genomic NR6 clone 2.2
(Figure 3) containing exons 7, 8 and 3′ flanking
sequence was subcloned into the XhoI site of pBluescript,
generating pBSNR6Xho4.5. A 2.9 kb NotI-StuI fragment
within NR6 intron 1 from the same genomic clone was
inserted into NotI and EcoRV digested pBSNR6Xho4.5,
creating pNR6-Ex2-6. This plasmid was digested with
ClaI, which was situated between the two NR6 fragments,
and following blunt ending, ligated with a blunted 6 kb
HindIII fragment from placZneo, which contains the
lacZ gene and a PGKneo cassette, to generate the final
targeting vector, pNR6lacZneo. pNR6lacZneo was
linearised with NotI and electroporated into W9.5
embryonic stem cells. After 48 hours, transfected cells
were selected in 175 μg/ml G418 and resistant clones
picked and expanded after a further 8 days.

Clones in which the targeting vector had recombined
with the endogenous NR6 gene were identified by
hybridising SpeI-digested genomic DNA with a 0.6 kb
XhoI-StuI fragment from genomic NR6 clone 2.2. This
probe (probe A, Figure 4), which is located 3′ to the
NR6 sequences in the targeting vector, distinguished
between the endogenous (9.9 kb) and targeted (7.1 kb)
NR6 loci (Figure 5).
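
The way the two SpeI fragment sizes distinguish genotypes can be summarised in a small sketch. The 9.9 kb and 7.1 kb sizes are those stated above; the size tolerance and the function name are hypothetical choices made for illustration.

```python
ENDOGENOUS_KB = 9.9  # SpeI fragment from the endogenous NR6 locus
TARGETED_KB = 7.1    # SpeI fragment from the targeted NR6 locus


def genotype_from_bands(band_sizes_kb, tolerance_kb=0.3):
    """Infer the NR6 genotype from probe A hybridising band sizes (kb).

    tolerance_kb is a hypothetical allowance for sizing error on the gel.
    """
    has_endogenous = any(abs(b - ENDOGENOUS_KB) <= tolerance_kb
                         for b in band_sizes_kb)
    has_targeted = any(abs(b - TARGETED_KB) <= tolerance_kb
                       for b in band_sizes_kb)
    if has_endogenous and has_targeted:
        return "NR6+/-"
    if has_endogenous:
        return "NR6+/+"
    if has_targeted:
        return "NR6-/-"
    return "undetermined"


if __name__ == "__main__":
    print(genotype_from_bands([9.9, 7.1]))
```

A correctly targeted ES cell clone is expected to be heterozygous and therefore to show both bands.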

Genomic DNA was digested with SpeI for 16 hrs at 37°C,
electrophoresed through 0.8% (w/v) agarose, transferred
to nylon membranes and hybridised to 32P-labelled probe
in a solution containing 0.5 M sodium phosphate, 7% (w/v)
SDS, 1 mM EDTA and washed in a solution containing 40 mM
sodium phosphate, 1% (w/v) SDS at 65°C. Hybridising
bands were visualised by autoradiography for 16 hours at
-70°C using Kodak XAR-5 film and intensifying screens.
Two targeted ES cell clones, W9.5NR6-2-44 and W9.5NR6-4-2,
were injected into C57Bl/6 blastocysts to generate
chimeric mice. Male chimeras were mated with C57Bl/6
females to yield NR6 heterozygotes which were
subsequently interbred to produce wild-type (NR6+/+),
heterozygous (NR6+/-) and mutant (NR6-/-) mice. The
genotypes of offspring were determined by Southern Blot
analysis of genomic DNA extracted from tail biopsies.

Genotyping of mice at weaning from matings between NR6+/-
heterozygous mice derived from both targeted ES cell
clones revealed an absence of homozygous NR6-/- mutants.
As no unusual loss of mice was observed between birth
and weaning, this suggests that lack of NR6 is lethal
during embryonic development or immediately after birth.
Genotyping of embryonic tissues at various stages of
development suggests that death occurs late in gestation
(beyond day 16) or at birth.
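
The reasoning behind this inference can be made explicit with a short calculation. Under Mendelian inheritance a heterozygous intercross is expected to yield wild-type, heterozygous and homozygous mutant offspring in a 1:2:1 ratio, so the chance of finding no homozygous mutants among n independently genotyped offspring is (3/4)^n. The sketch below is illustrative only; no litter sizes or counts are given in the text, and the value of n in the usage line is a hypothetical input.

```python
from fractions import Fraction

# Expected genotype fractions from a heterozygote x heterozygote mating.
EXPECTED_FRACTIONS = {
    "NR6+/+": Fraction(1, 4),
    "NR6+/-": Fraction(1, 2),
    "NR6-/-": Fraction(1, 4),
}


def prob_no_homozygous_mutants(n_offspring):
    """Probability of zero NR6-/- among n offspring if NR6-/- were fully viable."""
    return float((1 - EXPECTED_FRACTIONS["NR6-/-"]) ** n_offspring)


if __name__ == "__main__":
    # Hypothetical cohort size, for illustration only.
    print(prob_no_homozygous_mutants(20))
```

The probability falls below 1% once about 17 offspring have been genotyped, which is why a complete absence of NR6-/- animals at weaning points to lethality rather than chance.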

EXAMPLE 16 

Oligonucleotides

1943:
5' GTC CAA GTG CGT TGT AAC CCA 3'

2070:
5' GCT GAG TGT GCG CTG GGT CTC ACC 3'

2057:
5' GGC TCC ACT CGC TCC AGA 3'
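
As an illustration of how such oligonucleotides relate to the NR6 cDNA, the sketch below searches a short stretch of the coding sequence on both strands. The fragment is the portion of the coding region listed under SEQ ID NO: 12 that encodes residues 305 to 336; using only this fragment, and the function names, are choices made for this sketch rather than anything described in the text.

```python
# Coding sequence for residues 305-336, taken from SEQ ID NO: 12.
CDNA_FRAGMENT = ("AAGCCCGGCACCGTTTACTTCGTCCAAGTGCGTTGTAACCCATTCGGG"
                 "ATCTATGGGTCGAAAAAGGCGGGAATCTGGAGCGAGTGGAGCCACCCC")

OLIGO_1943 = "GTCCAAGTGCGTTGTAACCCA"
OLIGO_2070 = "GCTGAGTGTGCGCTGGGTCTCACC"
OLIGO_2057 = "GGCTCCACTCGCTCCAGA"


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def locate(oligo, template):
    """Return (0-based index, strand) of the oligo on the template, or None.

    '+' means the oligo matches the template as written; '-' means its
    reverse complement matches, i.e. the oligo is antisense.
    """
    index = template.find(oligo)
    if index != -1:
        return index, "+"
    index = template.find(reverse_complement(oligo))
    if index != -1:
        return index, "-"
    return None


if __name__ == "__main__":
    for name, oligo in (("1943", OLIGO_1943), ("2070", OLIGO_2070),
                        ("2057", OLIGO_2057)):
        print(name, locate(oligo, CDNA_FRAGMENT))
```

Within this fragment, 1943 matches the sense strand and 2057 matches the antisense strand across the region encoding Trp-Ser-Glu-Trp-Ser; 2070 lies outside the fragment and would need a longer template to be located.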



Those skilled in the art will appreciate that the
invention described herein is susceptible to variations
and modifications other than those specifically
described. It is to be understood that the invention
includes all such variations and modifications. The
invention also includes all of the steps, features,
compositions and compounds referred to or indicated in
this specification, individually or collectively, and
any and all combinations of any two or more of said
steps or features.

SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT: (Other than US) AMRAD OPERATIONS PTY LTD
     (US only) Douglas James HILTON, Nicos Antony
     NICOLA, Alison FARLEY, Tracey WILLSON, Jian-Guo ZHANG,
     Warren ALEXANDER, Steven RAKAR, Louis FABRI, Tetsuo
     KOJIMA, Masatsugu MAEDA, Yasumfumi KIKUCHI, Andrew NASH

(ii) TITLE OF INVENTION: A NOVEL HAEMOPOIETIN
     RECEPTOR AND GENETIC SEQUENCES ENCODING SAME

(iii) NUMBER OF SEQUENCES: 39

(iv) CORRESPONDENCE ADDRESS:
     (A) ADDRESSEE: DAVIES COLLISON CAVE
     (B) STREET: 1 LITTLE COLLINS STREET
     (C) CITY: MELBOURNE
     (D) STATE: VICTORIA
     (E) COUNTRY: AUSTRALIA
     (F) ZIP: 3000

(v) COMPUTER READABLE FORM:
     (A) MEDIUM TYPE: Floppy disk
     (B) COMPUTER: IBM PC compatible
     (C) OPERATING SYSTEM: PC-DOS/MS-DOS
     (D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:
     (A) APPLICATION NUMBER: PCT INTERNATIONAL APPLICATION
     (B) FILING DATE: 11-SEP-1997

(vii) PRIOR APPLICATION DATA:
     (A) APPLICATION NUMBER: P02246/96
     (B) FILING DATE: 11-SEP-1996

(viii) ATTORNEY/AGENT INFORMATION:
     (A) NAME: HUGHES DR, E JOHN L
     (C) REFERENCE/DOCKET NUMBER: EJH/AF

(ix) TELECOMMUNICATION INFORMATION:
     (A) TELEPHONE: +61 3 9254 2777
     (B) TELEFAX: +61 3 9254 2770



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 5 amino acids
     (B) TYPE: amino acid
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

Trp Ser Xaa Trp Ser



(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 24 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

ACTCGCTCCA GATTCCCGCC TTTT  24

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 24 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

TCCCGCCTTT TTCGACCCAT AGAT  24

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 24 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

GGTACTTGGC TTGGAAGAGG AAAT  24

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 24 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

CGGCTCACGT GCACGTCGGG TGGG  24

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 22 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

AGCTGCTGTT AAAGGGCTTC TC  22

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 15 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Oligonucleotide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

(A/G)CTCCA(A/G)TC(A/G) CTCCA  15

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 15 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Oligonucleotide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

(A/G)CTCCA(C/T)TC(A/G) CTCCA  15



(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 21 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

AAGTGTGACC ATCATGTGGA C  21



(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 18 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

GGAGGTGTTA AGGAGGCG  18



(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 18 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

ATGCCCGCGG GTCGCCCG  18



(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 1506 base pairs
     (B) TYPE: nucleic acid
     (C) STRANDEDNESS: single
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:
     (A) NAME/KEY: CDS
     (B) LOCATION: 1..1242

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

GGCACGAGCT TCGCTGTCCG CGCCCAGTGA CGCGCGTGCG GACCCGAGCC CCAATCTGCA  -64
CCCCGCAGAC TCGCCCCCGC CCCATACCGG CGTTGCAGTC ACCGCCCGTT GCGCGCCACC  -4
CCC  -1

ATG CCC GCG GGT CGC CCG GGC CCC GTC GCC CAA TCC GCG CGG CGG CCG  48
Met Pro Ala Gly Arg Pro Gly Pro Val Ala Gln Ser Ala Arg Arg Pro
1               5                   10                  15

CCG CGG CCG CTG TCC TCG CTG TGG TCG CCT CTG TTG CTC TGT GTC CTC  96
Pro Arg Pro Leu Ser Ser Leu Trp Ser Pro Leu Leu Leu Cys Val Leu
            20                  25                  30

GGG GTG CCT CGG GGC GGA TCG GGA GCC CAC ACA GCT GTA ATC AGC CCC  144
Gly Val Pro Arg Gly Gly Ser Gly Ala His Thr Ala Val Ile Ser Pro
        35                  40                  45

CAG GAC CCC ACC CTT CTC ATC GGC TCC TCC CTG CAA GCT ACC TGC TCT  192
Gln Asp Pro Thr Leu Leu Ile Gly Ser Ser Leu Gln Ala Thr Cys Ser
    50                  55                  60

ATA CAT GGA GAC ACA CCT GGG GCC ACC GCT GAG GGG CTC TAC TGG ACC  240
Ile His Gly Asp Thr Pro Gly Ala Thr Ala Glu Gly Leu Tyr Trp Thr
65                  70                  75                  80

CTC AAT GGT CGC CGC CTG CCC TCT GAG CTG TCC CGC CTC CTT AAC ACC  288
Leu Asn Gly Arg Arg Leu Pro Ser Glu Leu Ser Arg Leu Leu Asn Thr
                85                  90                  95

TCC ACC CTG GCC CTG GCC CTG GCT AAC CTT AAT GGG TCC AGG CAG CAG  336
Ser Thr Leu Ala Leu Ala Leu Ala Asn Leu Asn Gly Ser Arg Gln Gln
            100                 105                 110

TCA GGA GAC AAT CTG GTG TGT CAC GCC CGA GAC GGC AGC ATT CTG GCT  384
Ser Gly Asp Asn Leu Val Cys His Ala Arg Asp Gly Ser Ile Leu Ala
        115                 120                 125

GGC TCC TGC CTC TAT GTT GGC TTG CCC CCT GAG AAG CCC TTT AAC ATC  432
Gly Ser Cys Leu Tyr Val Gly Leu Pro Pro Glu Lys Pro Phe Asn Ile
    130                 135                 140

AGC TGC TGG TCC CGG AAC ATG AAG GAT CTC ACG TGC CGC TGG ACA CCG  480
Ser Cys Trp Ser Arg Asn Met Lys Asp Leu Thr Cys Arg Trp Thr Pro
145                 150                 155                 160

GGT GCA CAC GGG GAG ACA TTC TTA CAT ACC AAC TAC TCC CTC AAG TAC  528
Gly Ala His Gly Glu Thr Phe Leu His Thr Asn Tyr Ser Leu Lys Tyr
                165                 170                 175

AAG CTG AGG TGG TAC GGT CAG GAT AAC ACA TGT GAG GAG TAC CAC ACT  576
Lys Leu Arg Trp Tyr Gly Gln Asp Asn Thr Cys Glu Glu Tyr His Thr
            180                 185                 190

GTG GGC CCT CAC TCA TGC CAT ATC CCC AAG GAC CTG GCC CTC TTC ACT  624
Val Gly Pro His Ser Cys His Ile Pro Lys Asp Leu Ala Leu Phe Thr
        195                 200                 205

CCC TAT GAG ATC TGG GTG GAA GCC ACC AAT CGC CTA GGC TCA GCA AGA  672
Pro Tyr Glu Ile Trp Val Glu Ala Thr Asn Arg Leu Gly Ser Ala Arg
    210                 215                 220

TCT GAT GTC CTC ACA CTG GAT GTC CTG GAC GTG GTG ACC ACG GAC CCC  720
Ser Asp Val Leu Thr Leu Asp Val Leu Asp Val Val Thr Thr Asp Pro
225                 230                 235                 240

CCA CCC GAC GTG CAC GTG AGC CGC GTT GGG GGC CTG GAG GAC CAG CTG  768
Pro Pro Asp Val His Val Ser Arg Val Gly Gly Leu Glu Asp Gln Leu
                245                 250                 255

AGT GTG CGC TGG GTC TCA CCA CCA GCT CTC AAG GAT TTC CTC TTC CAA  816
Ser Val Arg Trp Val Ser Pro Pro Ala Leu Lys Asp Phe Leu Phe Gln
            260                 265                 270

GCC AAG TAC CAG ATC CGC TAC CGC GTG GAG GAC AGC GTG GAC TGG AAG  864
Ala Lys Tyr Gln Ile Arg Tyr Arg Val Glu Asp Ser Val Asp Trp Lys
        275                 280                 285

GTG GTG GAT GAC GTC AGC AAC CAG ACC TCC TGC CGT CTC GCG GGC CTG  912
Val Val Asp Asp Val Ser Asn Gln Thr Ser Cys Arg Leu Ala Gly Leu
    290                 295                 300

AAG CCC GGC ACC GTT TAC TTC GTC CAA GTG CGT TGT AAC CCA TTC GGG  960
Lys Pro Gly Thr Val Tyr Phe Val Gln Val Arg Cys Asn Pro Phe Gly
305                 310                 315                 320

ATC TAT GGG TCG AAA AAG GCG GGA ATC TGG AGC GAG TGG AGC CAC CCC  1008
Ile Tyr Gly Ser Lys Lys Ala Gly Ile Trp Ser Glu Trp Ser His Pro
                325                 330                 335

ACC GCT GCC TCC ACC CCT CGA AGT GAG CGC CCG GGC CCG GGC GGC GGG  1056
Thr Ala Ala Ser Thr Pro Arg Ser Glu Arg Pro Gly Pro Gly Gly Gly
            340                 345                 350

GTG TGC GAG CCG CGG GGC GGC GAG CCC AGC TCG GGC CCG GTG CGG CGC  1104
Val Cys Glu Pro Arg Gly Gly Glu Pro Ser Ser Gly Pro Val Arg Arg
        355                 360                 365

GAG CTC AAG CAG TTC CTC GGC TGG CTC AAG AAG CAC GCA TAC TGC TCG  1152
Glu Leu Lys Gln Phe Leu Gly Trp Leu Lys Lys His Ala Tyr Cys Ser
    370                 375                 380

AAC CTT AGT TTC CGC CTG TAC GAC CAG TGG CGT GCT TGG ATG CAG AAG  1200
Asn Leu Ser Phe Arg Leu Tyr Asp Gln Trp Arg Ala Trp Met Gln Lys
385                 390                 395                 400

TCA CAC AAG ACC CGA AAC CAG GTC CTG CCG GCT AAA CTC TAAGGATAGG  1249
Ser His Lys Thr Arg Asn Gln Val Leu Pro Ala Lys Leu
                405                 410

CCATCCTCCT GCTGGGTCAG ACCTGGAGGC TCACCTGAAT TGGAGCCCCT CTGTACCATC  1309
TGGGCAACAA AGAAACCTAC CAGAGGCTGG GGCACAATGA GCTCCCACAA CCACAGCTTT  1369
GGTCCACATG ATGGTCACAC TTGGATATAC CCCAGTGTGG GTAAGGTTGG GGTATTGCAG  1429
GGCCTCCCAA CAATCTCTTT AAATAAATAA AGGAGTTGTT CAGGTAAAAA AAAAAAAAAA  1489
AAAAAAAAAA AAAAAAA  1506



(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 413 amino acids
     (B) TYPE: amino acid
     (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

Met Pro Ala Gly Arg Pro Gly Pro Val Ala Gln Ser Ala Arg Arg Pro
1               5                   10                  15

Pro Arg Pro Leu Ser Ser Leu Trp Ser Pro Leu Leu Leu Cys Val Leu
            20                  25                  30

Gly Val Pro Arg Gly Gly Ser Gly Ala His Thr Ala Val Ile Ser Pro
        35                  40                  45

Gln Asp Pro Thr Leu Leu Ile Gly Ser Ser Leu Gln Ala Thr Cys Ser
    50                  55                  60

Ile His Gly Asp Thr Pro Gly Ala Thr Ala Glu Gly Leu Tyr Trp Thr
65                  70                  75                  80

Leu Asn Gly Arg Arg Leu Pro Ser Glu Leu Ser Arg Leu Leu Asn Thr
                85                  90                  95

Ser Thr Leu Ala Leu Ala Leu Ala Asn Leu Asn Gly Ser Arg Gln Gln
            100                 105                 110

Ser Gly Asp Asn Leu Val Cys His Ala Arg Asp Gly Ser Ile Leu Ala
        115                 120                 125

Gly Ser Cys Leu Tyr Val Gly Leu Pro Pro Glu Lys Pro Phe Asn Ile
    130                 135                 140

Ser Cys Trp Ser Arg Asn Met Lys Asp Leu Thr Cys Arg Trp Thr Pro
145                 150                 155                 160

Gly Ala His Gly Glu Thr Phe Leu His Thr Asn Tyr Ser Leu Lys Tyr
                165                 170                 175

Lys Leu Arg Trp Tyr Gly Gln Asp Asn Thr Cys Glu Glu Tyr His Thr
            180                 185                 190

Val Gly Pro His Ser Cys His Ile Pro Lys Asp Leu Ala Leu Phe Thr
        195                 200                 205

Pro Tyr Glu Ile Trp Val Glu Ala Thr Asn Arg Leu Gly Ser Ala Arg
    210                 215                 220

Ser Asp Val Leu Thr Leu Asp Val Leu Asp Val Val Thr Thr Asp Pro
225                 230                 235                 240

Pro Pro Asp Val His Val Ser Arg Val Gly Gly Leu Glu Asp Gln Leu
                245                 250                 255

Ser Val Arg Trp Val Ser Pro Pro Ala Leu Lys Asp Phe Leu Phe Gln
            260                 265                 270

Ala Lys Tyr Gln Ile Arg Tyr Arg Val Glu Asp Ser Val Asp Trp Lys
        275                 280                 285

Val Val Asp Asp Val Ser Asn Gln Thr Ser Cys Arg Leu Ala Gly Leu
    290                 295                 300

Lys Pro Gly Thr Val Tyr Phe Val Gln Val Arg Cys Asn Pro Phe Gly
305                 310                 315                 320

Ile Tyr Gly Ser Lys Lys Ala Gly Ile Trp Ser Glu Trp Ser His Pro
                325                 330                 335

Thr Ala Ala Ser Thr Pro Arg Ser Glu Arg Pro Gly Pro Gly Gly Gly
            340                 345                 350

Val Cys Glu Pro Arg Gly Gly Glu Pro Ser Ser Gly Pro Val Arg Arg
        355                 360                 365

Glu Leu Lys Gln Phe Leu Gly Trp Leu Lys Lys His Ala Tyr Cys Ser
    370                 375                 380

Asn Leu Ser Phe Arg Leu Tyr Asp Gln Trp Arg Ala Trp Met Gln Lys
385                 390                 395                 400

Ser His Lys Thr Arg Asn Gln Val Leu Pro Ala Lys Leu
                405                 410



(2) INFORMATION FOR SEQ ID NO: 14: 

25 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1549 base pairs 

(B) TYPE : nucleic acid 

(C) STRANDEDNESS : single 
30 (D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



35 (ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1278 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 



GGCACGAGCT TCGCTGTCCG CGCCCAGTGA CGCGCGTGCG GACCCGAGCC CCAATCTGCA -65 
CCCCGCAGAC TCGCCCCCGC CCCATACCGG CGTTGCAGTC ACCGCCCGTT GCGCGCCACC -5 
CCCA 



ATG CCC OCG GOT CGC CCG GGC CCC GTC GCC CAA TCC GCG CGG CGG CCG 
Met Pro Ala Gly Arg Pro Gly Pro Val Ala Gin Ser Ala Arg Arg Pro 

15 



1 5 10 



ATA CAT GGA GAC ACA CCT GGG GCC ACC GCT GAG GGG CTC TAC TGG ACC 
He HiB Gly Asp Thr Pro Gly Ala Thr Ala Glu Gly Leu Tyr Trp Thr 
65 70 75 80 

CTC AAT GGT CGC CGC CTG CCC TCT GAG CTG TCC CGC CTC CTT AAC ACC 
Leu Asn Gly Arg Arg Leu Pro Ser Glu Leu Ser Arg Leu Leu Asn Thr 
85 90 95 



48 



96 



CCG CGG CCG CTG TCC TCG CTG TGG TCG CCT CTG TTG CTC TGT GTC CTC 
Pro Arg Pro Leu Ser Ser Leu Trp Ser Pro Leu Leu Leu Cys Val Leu 
2° 25 30 

GGG GTG CCT CGG GGC GGA TCG GGA GCC CAC ACA GCT GTA ATC AGC CCC 144 
Gly Val Pro Arg Gly Gly Ser Gly Ala His Thr Ala Val lie Ser Pro 
35 40 45 

CAG GAC CCC ACC CTT CTC ATC GGC TCC TCC CTG CAA GCT ACC TCC TCT 192 
Gin Asp Pro Thr Leu Leu He Gly Ser Ser Leu Gin Ala Thr Cys Ser 
50 55 60 



240 



288 



TCC ACC CTG GCC CTG GCC CTG GCT AAC CTT AAT GGG TCC AGG CAG CAG 336 
Ser Thr Leu Ala Leu Ala Leu Ala Asn Leu Asn Gly Ser Arg Gin Gin 
!00 105 no 
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TCA GGA GAC AAT CTG GTG TGT CAC GCC CGA GAC GGC AGC ATT CTG GCT 384 
Ser Gly Asp Asn Leu Val Cys His Ala Arg Asp Gly Ser He Leu Ala 
115 120 i 2 5 



10 



30 



GGC TCC TGC CTC TAT GTT GGC TTG CCC CCT GAG AAG CCC TTT AAC ATC 
Gly Ser Cys Leu Tyr Val Gly Leu Pro Pro Glu Lys Pro Phe Asn He 
130 135 140 

AGC TGC TGG TCC CGG AAC ATG AAG GAT CTC ACG TGC CGC TGG ACA CCG 
Ser Cys Trp Ser Arg Asn Met Lys Asp Leu Thr Cys Arg Trp Thr Pro 
145 150 155 160 



432 



460 



528 



576 



GGT GCA CAC GGG GAG ACA TTC TTA CAT ACC AAC TAC TCC CTC AAG TAC 
Gly Ala His Gly Glu Thr Phe Leu His Thr Asn Tyr Ser Leu Lys Tyr 
15 165 170 175 

AAG CTG AGG TGG TAC GGT CAG GAT AAC ACA TGT GAG GAG TAC CAC ACT 
Lys Leu Arg Trp Tyr Gly Gin Asp Asn Thr Cys Glu Glu Tyr His Thr 
180 185 190 

20 

GTG GGC CCT CAC TCA TGC CAT ATC CCC AAG GAC CTG GCC CTC TTC ACT 624 
Val Gly Pro His Ser Cys His He Pro Lys Asp Leu Ala Leu Phe Thr 
195 200 205 

25 CCC TAT GAG ATC TGG GTG GAA GCC ACC AAT CGC CTA GGC TCA GCA AGA 672 

Pro Tyr Glu He Trp Val Glu Ala Thr Asn Arg Leu Gly Ser Ala Arg 
210 215 220 



35 



TCT GAT GTC CTC ACA CTG GAT GTC CTG GAC GTG GTG ACC ACG GAC CCC 720 
Ser Asp Val Leu Thr Leu Asp Val Leu Asp Val Val Thr Thr Asp Pro 
225 230 235 240 

CCA CCC GAC GTG CAC GTG AGC CGC GTT GGG GGC CTG GAG GAC CAG CTG 768
Pro Pro Asp Val His Val Ser Arg Val Gly Gly Leu Glu Asp Gln Leu
245 250 255

WO 98/11225 PCT/GB97/02479

AGT GTG CGC TGG GTC TCA CCA CCA GCT CTC AAG GAT TTC CTC TTC CAA 816
Ser Val Arg Trp Val Ser Pro Pro Ala Leu Lys Asp Phe Leu Phe Gln
260 265 270

GCC AAG TAC CAG ATC CGC TAC CGC GTG GAG GAC AGC GTG GAC TGG AAG 864
Ala Lys Tyr Gln Ile Arg Tyr Arg Val Glu Asp Ser Val Asp Trp Lys
275 280 285

GTG GTG GAT GAC GTC AGC AAC CAG ACC TCC TGC CGT CTC GCG GGC CTG 912
Val Val Asp Asp Val Ser Asn Gln Thr Ser Cys Arg Leu Ala Gly Leu
290 295 300

AAG CCC GGC ACC GTT TAC TTC GTC CAA GTG CGT TGT AAC CCA TTC GGG 960
Lys Pro Gly Thr Val Tyr Phe Val Gln Val Arg Cys Asn Pro Phe Gly
305 310 315 320

ATC TAT GGG TCG AAA AAG GCG GGA ATC TGG AGC GAG TGG AGC CAC CCC 1008
Ile Tyr Gly Ser Lys Lys Ala Gly Ile Trp Ser Glu Trp Ser His Pro
325 330 335

ACC GCT GCC TCC ACC CCT CGA AGT GAG CGC CCG GGC CCG GGC GGC GGG 1056
Thr Ala Ala Ser Thr Pro Arg Ser Glu Arg Pro Gly Pro Gly Gly Gly
340 345 350

GTG TGC GAG CCG CGG GGC GGC GAG CCC AGC TCG GGC CCG GTG CGG CGC 1104
Val Cys Glu Pro Arg Gly Gly Glu Pro Ser Ser Gly Pro Val Arg Arg
355 360 365

GAG CTC AAG CAG TTC CTC GGC TGG CTC AAG AAG CAC GCA TAC TGC TCG 1152
Glu Leu Lys Gln Phe Leu Gly Trp Leu Lys Lys His Ala Tyr Cys Ser
370 375 380

AAC CTT AGT TTC CGC CTG TAC GAC CAG TGG CGT GCT TGG ATG CAG AAG 1200
Asn Leu Ser Phe Arg Leu Tyr Asp Gln Trp Arg Ala Trp Met Gln Lys
385 390 395 400

TCA CAC AAG ACC CGA AAC CAG GAC GAG GGG ATC CTG CCT TCG GGC AGA 1248
Ser His Lys Thr Arg Asn Gln Asp Glu Gly Ile Leu Pro Ser Gly Arg
405 410 415

CGG GGT GCG GCG AGA GGT CCT GCC GGT TAAACTCTAA GGATAGGCCA 1295
Arg Gly Ala Ala Arg Gly Pro Ala Gly
420 425

TCCTCCTGCT GGGTCAGACC TGGAGGCTCA CCTGAATTGG AGCCCCTCTG TACCATCTGG 1355

GCAACAAAGA AACCTACCAG AGGCTGGGGC ACAATGAGCT CCCACAACCA CAGCTTTGGT 1415

CCACATGATG GTCACACTTG GATATACCCC AGTGTGGGTA AGGTTGGGGT ATTGCAGGGC 1475

CTCCCAACAA TCTCTTTAAA TAAATAAAGG AGTTGTTCAG GTAAAAAAAA AAAAAAAAAA 1535

AAAAAAAAAA AAAA 1549



(2) INFORMATION FOR SEQ ID NO: 15: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 425 amino acids 
(B) TYPE: amino acid

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

Met Pro Ala Gly Arg Pro Gly Pro Val Ala Gln Ser Ala Arg Arg Pro
1 5 10 15

Pro Arg Pro Leu Ser Ser Leu Trp Ser Pro Leu Leu Leu Cys Val Leu
20 25 30

Gly Val Pro Arg Gly Gly Ser Gly Ala His Thr Ala Val Ile Ser Pro
35 40 45

Gln Asp Pro Thr Leu Leu Ile Gly Ser Ser Leu Gln Ala Thr Cys Ser
50 55 60

Ile His Gly Asp Thr Pro Gly Ala Thr Ala Glu Gly Leu Tyr Trp Thr
65 70 75 80

Leu Asn Gly Arg Arg Leu Pro Ser Glu Leu Ser Arg Leu Leu Asn Thr
85 90 95

Ser Thr Leu Ala Leu Ala Leu Ala Asn Leu Asn Gly Ser Arg Gln Gln
100 105 110

Ser Gly Asp Asn Leu Val Cys His Ala Arg Asp Gly Ser Ile Leu Ala
115 120 125

Gly Ser Cys Leu Tyr Val Gly Leu Pro Pro Glu Lys Pro Phe Asn Ile
130 135 140

Ser Cys Trp Ser Arg Asn Met Lys Asp Leu Thr Cys Arg Trp Thr Pro
145 150 155 160

Gly Ala His Gly Glu Thr Phe Leu His Thr Asn Tyr Ser Leu Lys Tyr
165 170 175

Lys Leu Arg Trp Tyr Gly Gln Asp Asn Thr Cys Glu Glu Tyr His Thr
180 185 190

Val Gly Pro His Ser Cys His Ile Pro Lys Asp Leu Ala Leu Phe Thr
195 200 205

Pro Tyr Glu Ile Trp Val Glu Ala Thr Asn Arg Leu Gly Ser Ala Arg
210 215 220

Ser Asp Val Leu Thr Leu Asp Val Leu Asp Val Val Thr Thr Asp Pro
225 230 235 240

Pro Pro Asp Val His Val Ser Arg Val Gly Gly Leu Glu Asp Gln Leu
245 250 255

Ser Val Arg Trp Val Ser Pro Pro Ala Leu Lys Asp Phe Leu Phe Gln
260 265 270

Ala Lys Tyr Gln Ile Arg Tyr Arg Val Glu Asp Ser Val Asp Trp Lys
275 280 285

Val Val Asp Asp Val Ser Asn Gln Thr Ser Cys Arg Leu Ala Gly Leu
290 295 300

Lys Pro Gly Thr Val Tyr Phe Val Gln Val Arg Cys Asn Pro Phe Gly
305 310 315 320

Ile Tyr Gly Ser Lys Lys Ala Gly Ile Trp Ser Glu Trp Ser His Pro
325 330 335

Thr Ala Ala Ser Thr Pro Arg Ser Glu Arg Pro Gly Pro Gly Gly Gly
340 345 350

Val Cys Glu Pro Arg Gly Gly Glu Pro Ser Ser Gly Pro Val Arg Arg
355 360 365

Glu Leu Lys Gln Phe Leu Gly Trp Leu Lys Lys His Ala Tyr Cys Ser
370 375 380

Asn Leu Ser Phe Arg Leu Tyr Asp Gln Trp Arg Ala Trp Met Gln Lys
385 390 395 400

Ser His Lys Thr Arg Asn Gln Asp Glu Gly Ile Leu Pro Ser Gly Arg
405 410 415

Arg Gly Ala Ala Arg Gly Pro Ala Gly
420 425



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 938 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..468

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:



GGC ACC GTT TAC TTC GTC CAA GTG CGT TGT AAC CCA TTC GGG ATC TAT 48
Gly Thr Val Tyr Phe Val Gln Val Arg Cys Asn Pro Phe Gly Ile Tyr
1 5 10 15

GGG TCG AAA AAG GCG GGA ATC TGG AGC GAG TGG AGC CAC CCC ACC GCT 96
Gly Ser Lys Lys Ala Gly Ile Trp Ser Glu Trp Ser His Pro Thr Ala
20 25 30

GCC TCC ACC CCT CGA AGT GAG CGC CCG GGC CCG GGC GGC GGG GTG TGC 144
Ala Ser Thr Pro Arg Ser Glu Arg Pro Gly Pro Gly Gly Gly Val Cys
35 40 45

GAG CCG CGG GGC GGC GAG CCC AGC TCG GGC CCG GTG CGG CGC GAG CTC 192
Glu Pro Arg Gly Gly Glu Pro Ser Ser Gly Pro Val Arg Arg Glu Leu
50 55 60

AAG CAG TTC CTC GGC TGG CTC AAG AAG CAC GCA TAC TGC TCG AAC CTT 240
Lys Gln Phe Leu Gly Trp Leu Lys Lys His Ala Tyr Cys Ser Asn Leu
65 70 75 80

AGT TTC CGC CTG TAC GAC CAG TGG CGT GCT TGG ATG CAG AAG TCA CAC 288
Ser Phe Arg Leu Tyr Asp Gln Trp Arg Ala Trp Met Gln Lys Ser His
85 90 95

AAG ACC CGA AAC CAG GTA GGA AAG TTG GGG GAG GCT TGC GTG GGG GGT 336
Lys Thr Arg Asn Gln Val Gly Lys Leu Gly Glu Ala Cys Val Gly Gly
100 105 110

AAA GGA GCA GAG GAA GAG AGA GAC CCG GGT GAG CAG CCT CCA CAA CAC 384
Lys Gly Ala Glu Glu Glu Arg Asp Pro Gly Glu Gln Pro Pro Gln His
115 120 125

CGC ACT CTT CTT TCC AAG CAC AGG ACG AGG GGA TCC TGC CCT CGG GCA 432
Arg Thr Leu Leu Ser Lys His Arg Thr Arg Gly Ser Cys Pro Arg Ala
130 135 140

GAC GGG GTG CGG CGA GAG GTA AGG GGG TCT GGG TGAGTGGGGC CTACAGCAGT 485
Asp Gly Val Arg Arg Glu Val Arg Gly Ser Gly
145 150 155

CTAGATGAGG CCCTTTCCCC TCCTTCGGTG TTGCTCAAAG GGATCTCTTA GTGCTCATTT 545

CACCCACTGC AAAGAGCCCC AGGTTTTACT GCATCATCAA GTTGCTGAAG GGTCCAGGCT 605

TAATGTGGCC TCTTTTCTGC CCTCAGGTCC TGCCGGCTAA ACTCTAAGGA TAGGCCATCC 665

TCCTGCTGGG TCAGACCTGG AGGCTCACCT GAATTGGAGC CCCTCTGTAC CTATCTGGGC 725

AACAAAGAAA CCTACCATGA GGCTGGGGCA CAATGAGCTC CCACAACCAC AGCTTTGGTC 785

CACATGATGG TCACACTTGG ATATACCCCA GTGTGGGTAA GGTTGGGGTA TTGCAGGGCC 845

TCCCAACAAT CTCTTTAAAT AAATAAAGGA GTTGTTCAGG TAAAAAAAAA AAAAAAAAAA 905

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAA 938

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 155 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

Gly Thr Val Tyr Phe Val Gln Val Arg Cys Asn Pro Phe Gly Ile Tyr
1 5 10 15

Gly Ser Lys Lys Ala Gly Ile Trp Ser Glu Trp Ser His Pro Thr Ala
20 25 30

Ala Ser Thr Pro Arg Ser Glu Arg Pro Gly Pro Gly Gly Gly Val Cys
35 40 45

Glu Pro Arg Gly Gly Glu Pro Ser Ser Gly Pro Val Arg Arg Glu Leu
50 55 60

Lys Gln Phe Leu Gly Trp Leu Lys Lys His Ala Tyr Cys Ser Asn Leu
65 70 75 80

Ser Phe Arg Leu Tyr Asp Gln Trp Arg Ala Trp Met Gln Lys Ser His
85 90 95

Lys Thr Arg Asn Gln Val Gly Lys Leu Gly Glu Ala Cys Val Gly Gly
100 105 110

Lys Gly Ala Glu Glu Glu Arg Asp Pro Gly Glu Gln Pro Pro Gln His
115 120 125

Arg Thr Leu Leu Ser Lys His Arg Thr Arg Gly Ser Cys Pro Arg Ala
130 135 140

Asp Gly Val Arg Arg Glu Val Arg Gly Ser Gly
145 150 155



(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 834 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..834

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:



CCC ACC CTT CTC ATC GGC TCC TCC CTG CAA GCT ACC TGC TCT ATA CAT
Pro Thr Leu Leu Ile Gly Ser Ser Leu Gln Ala Thr Cys Ser Ile His
51 55 60 65

GGA GAC ACA CCT GGG GCC ACC GCT GAG GGG CTC TAC TGG ACC CTC AAT
Gly Asp Thr Pro Gly Ala Thr Ala Glu Gly Leu Tyr Trp Thr Leu Asn
70 75 80

GGT CGC CGC CTG CCC TCT GAG CTG TCC CGC CTC CTT AAC ACC TCC ACC
Gly Arg Arg Leu Pro Ser Glu Leu Ser Arg Leu Leu Asn Thr Ser Thr
85 90 95

CTG GCC CTG GCC CTG GCT AAC CTT AAT GGG TCC AGG CAG CAG TCA GGA
Leu Ala Leu Ala Leu Ala Asn Leu Asn Gly Ser Arg Gln Gln Ser Gly
100 105 110

GAC AAT CTG GTG TGT CAC GCC CGA GAC GGC AGC ATT CTG GCT GGC TCC
Asp Asn Leu Val Cys His Ala Arg Asp Gly Ser Ile Leu Ala Gly Ser
115 120 125 130

TGC CTC TAT GTT GGC TTG CCC CCT GAG AAG CCC TTT AAC ATC AGC TGC
Cys Leu Tyr Val Gly Leu Pro Pro Glu Lys Pro Phe Asn Ile Ser Cys
135 140 145

TGG TCC CGG AAC ATG AAG GAT CTC ACG TGC CGC TGG ACA CCG GGT GCA
Trp Ser Arg Asn Met Lys Asp Leu Thr Cys Arg Trp Thr Pro Gly Ala
150 155 200

CAC GGG GAG ACA TTC TTA CAT ACC AAC TAC TCC CTC AAG TAC AAG CTG
His Gly Glu Thr Phe Leu His Thr Asn Tyr Ser Leu Lys Tyr Lys Leu
205 210 215

AGG TGG TAC GGT CAG GAT AAC ACA TGT GAG GAG TAC CAC ACT GTG GGG
Arg Trp Tyr Gly Gln Asp Asn Thr Cys Glu Glu Tyr His Thr Val Gly
220 225 230

CCC CAC TCA TGC CAT ATC CCC AAG GAC CTG GCC CTC TTC ACT CCC TAT
Pro His Ser Cys His Ile Pro Lys Asp Leu Ala Leu Phe Thr Pro Tyr
235 240 245 250

GAG ATC TGG GTG GAA GCC ACC AAT CGC CTA GGC TCA GCA AGA TCT GAT 578
Glu Ile Trp Val Glu Ala Thr Asn Arg Leu Gly Ser Ala Arg Ser Asp
255 260 265

GTC CTC ACA CTG GAT GTC CTG GAC GTG GTG ACC ACG GAC CCC CCA CCC 626
Val Leu Thr Leu Asp Val Leu Asp Val Val Thr Thr Asp Pro Pro Pro
270 275 280

GAC GTG CAC GTG AGC CGC GTT GGG GGC CTG GAG GAC CAG CTG AGT GTG 674
Asp Val His Val Ser Arg Val Gly Gly Leu Glu Asp Gln Leu Ser Val
285 290 295

CGC TGG GTC TCA CCA CCA GCT CTC AAG GAT TTC CTC TTC CAA GCC AAG 722
Arg Trp Val Ser Pro Pro Ala Leu Lys Asp Phe Leu Phe Gln Ala Lys
300 305 310

TAC CAG ATC CGC TAC CGC GTG GAG GAC AGC GTG GAC TGG AAG GTG GTG 770
Tyr Gln Ile Arg Tyr Arg Val Glu Asp Ser Val Asp Trp Lys Val Val
315 320 325 330

GAT GAC GTC AGC AAC CAG ACC TCC TGC CGT CTC GCG GGC CTG AAG CCC 818
Asp Asp Val Ser Asn Gln Thr Ser Cys Arg Leu Ala Gly Leu Lys Pro
335 340 345

GGC ACC GTT TAC TTC GTC CAA GTG CGT TGT AAC CCA TTC GGG ATC TAT 866
Gly Thr Val Tyr Phe Val Gln Val Arg Cys Asn Pro Phe Gly Ile Tyr
350 355 360

GGG TCG AAA AAG GCG GGA 894
Gly Ser Lys Lys Ala Gly
365



(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 278 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:



Pro Thr Leu Leu Ile Gly Ser Ser Leu Gln Ala Thr Cys Ser Ile His
51 55 60 65

Gly Asp Thr Pro Gly Ala Thr Ala Glu Gly Leu Tyr Trp Thr Leu Asn
70 75 80

Gly Arg Arg Leu Pro Ser Glu Leu Ser Arg Leu Leu Asn Thr Ser Thr
85 90 95

Leu Ala Leu Ala Leu Ala Asn Leu Asn Gly Ser Arg Gln Gln Ser Gly
100 105 110

Asp Asn Leu Val Cys His Ala Arg Asp Gly Ser Ile Leu Ala Gly Ser
115 120 125 130

Cys Leu Tyr Val Gly Leu Pro Pro Glu Lys Pro Phe Asn Ile Ser Cys
135 140 145

Trp Ser Arg Asn Met Lys Asp Leu Thr Cys Arg Trp Thr Pro Gly Ala
150 155 200

His Gly Glu Thr Phe Leu His Thr Asn Tyr Ser Leu Lys Tyr Lys Leu
205 210 215

Arg Trp Tyr Gly Gln Asp Asn Thr Cys Glu Glu Tyr His Thr Val Gly
220 225 230

Pro His Ser Cys His Ile Pro Lys Asp Leu Ala Leu Phe Thr Pro Tyr
235 240 245 250

Glu Ile Trp Val Glu Ala Thr Asn Arg Leu Gly Ser Ala Arg Ser Asp
255 260 265

Val Leu Thr Leu Asp Val Leu Asp Val Val Thr Thr Asp Pro Pro Pro
270 275 280

Asp Val His Val Ser Arg Val Gly Gly Leu Glu Asp Gln Leu Ser Val
285 290 295

Arg Trp Val Ser Pro Pro Ala Leu Lys Asp Phe Leu Phe Gln Ala Lys
300 305 310

Tyr Gln Ile Arg Tyr Arg Val Glu Asp Ser Val Asp Trp Lys Val Val
315 320 325 330

Asp Asp Val Ser Asn Gln Thr Ser Cys Arg Leu Ala Gly Leu Lys Pro
335 340 345

Gly Thr Val Tyr Phe Val Gln Val Arg Cys Asn Pro Phe Gly Ile Tyr
350 355 360

Gly Ser Lys Lys Ala Gly
365

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 143 base pairs 
(B) TYPE: nucleic acid

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

GGCATGAAGG CTTAGGGTGG GGATCGGTAG GACCCATGCA CCCAGAGAAA GGGACTGGTG 60
GCAACTTTCA AACTCTCTGG GGAAGGAAGA AGGGCTGAAA GAGG 104

ATG AAC GGG CTC AGA CAC AGC TGT AAT CAG CCC CCA GGA 143
Met Asn Gly Leu Arg His Ser Cys Asn Gln Pro Pro Gly
5 10
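
The pairing of codons and residues in SEQ ID NO: 20 can be checked mechanically. The following is a minimal sketch, not part of the original listing, that translates the coding portion with the standard genetic code; the names `CODON_TABLE`, `translate`, `SEQ20_CDS` and `SEQ20_PEPTIDE` are illustrative assumptions.

```python
# Standard genetic code, with codons enumerated in the base order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence into one-letter amino acid codes."""
    cds = "".join(cds.split()).upper()  # drop the spaces between codons
    if len(cds) % 3:
        raise ValueError("coding sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[n:n + 3]] for n in range(0, len(cds), 3))


# Coding portion of SEQ ID NO: 20 (the 13 codons ending at nucleotide 143).
SEQ20_CDS = "ATG AAC GGG CTC AGA CAC AGC TGT AAT CAG CCC CCA GGA"
SEQ20_PEPTIDE = translate(SEQ20_CDS)
print(SEQ20_PEPTIDE)
```

In one-letter code the result corresponds to Met Asn Gly Leu Arg His Ser Cys Asn Gln Pro Pro Gly, the 13-residue peptide listed with the nucleotide sequence.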



(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

Met Asn Gly Leu Arg His Ser Cys Asn Gln Pro Pro Gly
5 10

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1930 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

GGCACGAGCT TCGCTGTCCG CGCCCAGTGA CGCGCGTGCG GACCCGAGCC CCAATCTGCA 60

CCCCGCAGAC TCGCCCCCGC CCCATACCGG CGTTGCAGTC ACCGCCCGTT GCGCGCCACC 120

CCCAATGCCC GCGGGTCGCC CGGGCCCCGT CGCCCAATCC GCGCGGCGGC CGCCGCGGCC 180

GCTGTCCTCG CTGTGGTCGC CTCTGTTGCT CTGTGTCCTC GGGGTGCCTC GGGGCGGATC 240

GGGAGCCCAC ACAGCTGTAA TCAGCCCCCA GGACCCCACC CTTCTCATCG GCTCCTCCCT 300

GCAAGCTACC TGCTCTATAC ATGGAGACAC ACCTGGGGCC ACCGCTGAGG GGCTCTACTG 360

GACCCTCAAT GGTCGCCGCC TGCCCTCTGA GCTGTCCCGC CTCCTTAACA CCTCCACCCT 420

GGCCCTGGCC CTGGCTAACC TTAATGGGTC CAGGCAGCAG TCAGGAGACA ATCTGGTGTG 480

TCACGCCCGA GACGGCAGCA TTCTGGCTGG CTCCTGCCTC TATGTTGGCT TGCCCCCTGA 540

GAAGCCCTTT AACATCAGCT GCTGGTCCCG GAACATGAAG GATCTCACGT GCCGCTGGAC 600

ACCGGGTGCA CACGGGGAGA CATTCTTACA TACCAACTAC TCCCTCAAGT ACAAGCTGAG 660

GTGGTACGGT CAGGATAACA CATGTGAGGA GTACCACACT GTGGGCCCTC ACTCATGCCA 720

TATCCCCAAG GACCTGGCCC TCTTCACTCC CTATGAGATC TGGGTGGAAG CCACCAATCG 780

CCTAGGCTCA GCAAGATCTG ATGTCCTCAC ACTGGATGTC CTGGACGTGG TGACCACGGA 840

CCCCCCACCC GACGTGCACG TGAGCCGCGT TGGGGGCCTG GAGGACCAGC TGAGTGTGCG 900

CTGGGTCTCA CCACCAGCTC TCAAGGATTT CCTCTTCCAA GCCAAGTACC AGATCCGCTA 960

CCGCGTGGAG GACAGCGTGG ACTGGAAGGT GGTGGATGAC GTCAGCAACC AGACCTCCTG 1020

CCGTCTCGCG GGCCTGAAGC CCGGCACCGT TTACTTCGTC CAAGTGCGTT GTAACCCATT 1080

CGGGATCTAT GGGTCGAAAA AGGCGGGAAT CTGGAGCGAG TGGAGCCACC CCACCGCTGC 1140

CTCCACCCCT CGAAGTGAGC GCCCGGGCCC GGGCGGCGGG GTGTGCGAGC CGCGGGGCGG 1200

CGAGCCCAGC TCGGGCCCGG TGCGGCGCGA GCTCAAGCAG TTCCTCGGCT GGCTCAAGAA 1260

GCACGCATAC TGCTCGAACC TTAGTTTCCG CCTGTACGAC CAGTGGCGTG CTTGGATGCA 1320

GAAGTCACAC AAGACCCGAA ACCAGGTAGG AAAGTTGGGG GAGGCTTGCG TGGGGGGTAA 1380

AGGAGCAGAG GAAGAGAGAG ACCCGGGTGA GCAGCCTCCA CAACACCGCA CTCTTCTTTC 1440

CAAGCACAGG ACGAGGGGAT CCTGCCCTCG GGCAGACGGG GTGCGGCGAG AGGTAAGGGG 1500

GTCTGGGTGA GTGGGGCCTA CAGCAGTCTA GATGAGGCCC TTTCCCCTCC TTCGGTGTTG 1560

CTCAAAGGGA TCTCTTAGTG CTCATTTCAC CCACTGCAAA GAGCCCCAGG TTTTACTGCA 1620

TCATCAAGTT GCTGAAGGGT CCAGGCTTAA TGTGGCCTCT TTTCTGCCCT CAGGTCCTGC 1680

CGGCTAAACT CTAAGGATAG GCCATCCTCC TGCTGGGTCA GACCTGGAGG CTCACCTGAA 1740

TTGGAGCCCC TCTGTACCTA TCTGGGCAAC AAAGAAACCT ACCATGAGGC TGGGGCACAA 1800

TGAGCTCCCA CAACCACAGC TTTGGTCCAC ATGATGGTCA CACTTGGATA TACCCCAGTG 1860

TGGGTAAGGT TGGGGTATTG CAGGGCCTCC CAACAATCTC TTTAAATAAA TAAAGGAGTT 1920

GTTCAGGTAA 1930

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 560 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

TCCAGGCAGC GGTCGGGGGA CAACCTCGTG TGCCACGCCC GTGACGGCAG CATCCTGGCT 60

GGCTCCTGCC TCTATGTTGG CCTGCCCCCA GAGAAACCCG TCAACATCAG CTGCTGGTCC 120

AAGAACATGA AGGACTTGAC CTGCCGCTGG ACGCCAGGGG CCCACGGGGA GACCTTCCTC 180

CACACCAACT ACTCCCTCAA GTACAAGCTT AGGTGGTATG GCCAGGACAA CACATGTGAG 240

GAGTACCACA CAGTGGGGCC CCACTCCTGC CACATCCCCA AGGACCTGGC TCTCTTTACG 300

CCCTATGAGA TCTGGGTGGA GGCCACCAAC CGCCTGGGCT CTGCCCGCTC CGATGTACTC 360

ACGCTGGATA TCCTGGATGT GGTGACCACG GACCCCCCGC CCGACGTGCA CGTGAGCCGC 420

GTCGGGGGCC TGGAGGACCA GCTGAGCGTG CGCTGGGTGT CGCCACCCGC CCTCAAGGAT 480

TTCCTTTTTC AAGCCAAATA CCAGATCCGC TACCGAGTGG AGGACAGTGT GGAATGGAAG 540

GTGGTGGACG ATGTGAGCAA 560

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1391 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1053

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

ACC CTC AAC GGG CGC CGC CTG CCC CCT GAG CTC TCC CGT GTA CTC AAC 48
Thr Leu Asn Gly Arg Arg Leu Pro Pro Glu Leu Ser Arg Val Leu Asn
1 5 10 15

GCC TCC ACC TTG GCT CTG GCC CTG GCC AAC CTC AAT GGG TCC AGG CAG 96
Ala Ser Thr Leu Ala Leu Ala Leu Ala Asn Leu Asn Gly Ser Arg Gln
20 25 30

CGG TCG GGG GAC AAC CTC GTG TGC CAC GCC CGT GAC GGC AGC ATC CTG 144
Arg Ser Gly Asp Asn Leu Val Cys His Ala Arg Asp Gly Ser Ile Leu
35 40 45

GCT GGC TCC TGC CTC TAT GTT GGC CTG CCC CCA GAG AAA CCC GTC AAC 192
Ala Gly Ser Cys Leu Tyr Val Gly Leu Pro Pro Glu Lys Pro Val Asn
50 55 60

ATC AGC TGC TGG TCC AAG AAC ATG AAG GAC TTG ACC TGC CGC TGG ACG 240
Ile Ser Cys Trp Ser Lys Asn Met Lys Asp Leu Thr Cys Arg Trp Thr
65 70 75 80

CCA GGG GCC CAC GGG GAG ACC TTC CTC CAC ACC AAC TAC TCC CTC AAG 288
Pro Gly Ala His Gly Glu Thr Phe Leu His Thr Asn Tyr Ser Leu Lys
85 90 95

TAC AAG CTT AGG TGG TAT GGC CAG GAC AAC ACA TGT GAG GAG TAC CAC 336
Tyr Lys Leu Arg Trp Tyr Gly Gln Asp Asn Thr Cys Glu Glu Tyr His
100 105 110

ACA GTG GGG CCC CAC TCC TGC CAC ATC CCC AAG GAC CTG GCT CTC TTT 384
Thr Val Gly Pro His Ser Cys His Ile Pro Lys Asp Leu Ala Leu Phe
115 120 125

ACG CCC TAT GAG ATC TGG GTG GAG GCC ACC AAC CGC CTG GGC TCT GCC 432
Thr Pro Tyr Glu Ile Trp Val Glu Ala Thr Asn Arg Leu Gly Ser Ala
130 135 140

CGC TCC GAT GTA CTC ACG CTG GAT ATC CTG GAT GTG GTG ACC ACG GAC 480
Arg Ser Asp Val Leu Thr Leu Asp Ile Leu Asp Val Val Thr Thr Asp
145 150 155 160

CCC CCG CCC GAC GTG CAC GTG AGC CGC GTC GGG GGC CTG GAG GAC CAG 528
Pro Pro Pro Asp Val His Val Ser Arg Val Gly Gly Leu Glu Asp Gln
165 170 175

CTG AGC GTG CGC TGG GTG TCG CCA CCC GCC CTC AAG GAT TTC CTC TTT 576
Leu Ser Val Arg Trp Val Ser Pro Pro Ala Leu Lys Asp Phe Leu Phe
180 185 190

CAA GCC AAA TAC CAG ATC CGC TAC CGA GTG GAG GAC AGT GTG GAC TGG 624
Gln Ala Lys Tyr Gln Ile Arg Tyr Arg Val Glu Asp Ser Val Asp Trp
195 200 205

AAG GTG GTG GAC GAT GTG AGC AAC CAG ACC TCC TGC CGC CTG GCC GGC 672
Lys Val Val Asp Asp Val Ser Asn Gln Thr Ser Cys Arg Leu Ala Gly
210 215 220

CTG AAA CCC GGC ACC GTG TAC TTC GTG CAA GTG CGC TGC AAC CCC TTT 720
Leu Lys Pro Gly Thr Val Tyr Phe Val Gln Val Arg Cys Asn Pro Phe
225 230 235 240

GGC ATC TAT GGC TCC AAG AAA GCC GGG ATC TGG AGT GAG TGG AGC CAC 768
Gly Ile Tyr Gly Ser Lys Lys Ala Gly Ile Trp Ser Glu Trp Ser His
245 250 255

CCC ACA GCC GCC TCC ACT CCC CGC ACT GAG CGC CCG GGC CCG GGC GGC 816
Pro Thr Ala Ala Ser Thr Pro Arg Ser Glu Arg Pro Gly Pro Gly Gly
260 265 270

GGG GCG TGC GAA CCG CGG GGC GGA GAG CCG AGC TCG GGG CCG GTG CGG 864
Gly Ala Cys Glu Pro Arg Gly Gly Glu Pro Ser Ser Gly Pro Val Arg
275 280 285

CGC GAG CTC AAG CAG TTC CTG GGC TGG CTC AAG AAG CAC GCG TAC TGC 912
Arg Glu Leu Lys Gln Phe Leu Gly Trp Leu Lys Lys His Ala Tyr Cys
290 295 300

TCC AAC CTC AGC TTC CGC CTC TAC GAC CAG TGG CGA GCC TGG ATG CAG 960
Ser Asn Leu Ser Phe Arg Leu Tyr Asp Gln Trp Arg Ala Trp Met Gln
305 310 315 320

AAG TCG CAC AAG ACC CGC AAC CAG CAC AGG ACG AGG GGA TCC TGC CCT 1008
Lys Ser His Lys Thr Arg Asn Gln His Arg Thr Arg Gly Ser Cys Pro
325 330 335

CGG GCA GAC GGG GCA CGG CGA GAG GTC CTG CCA GAT AAG CTG TAGGGGCTCA 1060
Arg Ala Asp Gly Ala Arg Arg Glu Val Leu Pro Asp Lys Leu
340 345 350

GGCCACCCTC CCTGCCACGT GGAGACGCAG AGGCCGAACC CAAACTGGGG CCACCTCTGT 1120

ACCCTCACTT CAGGGCACCT GAGCCCCTCA GCAGGAGCTG GGGTGGCCCC TGAGCTCCAA 1180

CGGCCATAAC AGCTCTGACT CCCACGTGAG GCCACCTTTG GGTGCACCCC AGTGGGTGTG 1240

TGTGTGTGTG TGAGGGTTGG TTGAGTTGCC TAGAACCCCT GCCAGGGCTG GGGGTGAGAA 1300

GGGGAGTCAT TACTCCCCAT TACCTAGGGC CCCTCCAAAA GAGTCCTTTT AAATAAATGA 1360

GCTATTTAGG TGCAAAAAAA AAAAAAAAAA A 1391

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 350 amino acids 
(B) TYPE: amino acid

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

Thr Leu Asn Gly Arg Arg Leu Pro Pro Glu Leu Ser Arg Val Leu Asn
1 5 10 15

Ala Ser Thr Leu Ala Leu Ala Leu Ala Asn Leu Asn Gly Ser Arg Gln
20 25 30

Arg Ser Gly Asp Asn Leu Val Cys His Ala Arg Asp Gly Ser Ile Leu
35 40 45

Ala Gly Ser Cys Leu Tyr Val Gly Leu Pro Pro Glu Lys Pro Val Asn
50 55 60

Ile Ser Cys Trp Ser Lys Asn Met Lys Asp Leu Thr Cys Arg Trp Thr
65 70 75 80

Pro Gly Ala His Gly Glu Thr Phe Leu His Thr Asn Tyr Ser Leu Lys
85 90 95

Tyr Lys Leu Arg Trp Tyr Gly Gln Asp Asn Thr Cys Glu Glu Tyr His
100 105 110

Thr Val Gly Pro His Ser Cys His Ile Pro Lys Asp Leu Ala Leu Phe
115 120 125

Thr Pro Tyr Glu Ile Trp Val Glu Ala Thr Asn Arg Leu Gly Ser Ala
130 135 140

Arg Ser Asp Val Leu Thr Leu Asp Ile Leu Asp Val Val Thr Thr Asp
145 150 155 160

Pro Pro Pro Asp Val His Val Ser Arg Val Gly Gly Leu Glu Asp Gln
165 170 175

Leu Ser Val Arg Trp Val Ser Pro Pro Ala Leu Lys Asp Phe Leu Phe
180 185 190

Gln Ala Lys Tyr Gln Ile Arg Tyr Arg Val Glu Asp Ser Val Asp Trp
195 200 205

Lys Val Val Asp Asp Val Ser Asn Gln Thr Ser Cys Arg Leu Ala Gly
210 215 220

Leu Lys Pro Gly Thr Val Tyr Phe Val Gln Val Arg Cys Asn Pro Phe
225 230 235 240

Gly Ile Tyr Gly Ser Lys Lys Ala Gly Ile Trp Ser Glu Trp Ser His
245 250 255

Pro Thr Ala Ala Ser Thr Pro Arg Ser Glu Arg Pro Gly Pro Gly Gly
260 265 270

Gly Ala Cys Glu Pro Arg Gly Gly Glu Pro Ser Ser Gly Pro Val Arg
275 280 285

Arg Glu Leu Lys Gln Phe Leu Gly Trp Leu Lys Lys His Ala Tyr Cys
290 295 300

Ser Asn Leu Ser Phe Arg Leu Tyr Asp Gln Trp Arg Ala Trp Met Gln
305 310 315 320

Lys Ser His Lys Thr Arg Asn Gln His Arg Thr Arg Gly Ser Cys Pro
325 330 335

Arg Ala Asp Gly Ala Arg Arg Glu Val Leu Pro Asp Lys Leu
340 345 350



(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:



TCCAGGCAGC GGTCGGGGGA CAAC 



(2) INFORMATION FOR SEQ ID NO:27: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

TTGCTCACAT CGTCCACCAC CTTC
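
Comparing the listed sequences suggests that SEQ ID NO: 26 matches the first 24 nucleotides of SEQ ID NO: 23 and that SEQ ID NO: 27 is the reverse complement of its last 24 nucleotides, as would be expected of an oligonucleotide pair flanking that fragment. This is an observation drawn from the sequences themselves, not a statement made in the listing. The sketch below checks it; the names `reverse_complement`, `compact`, `SEQ23_START` and `SEQ23_END` are illustrative assumptions, and only the first row and last two rows of SEQ ID NO: 23 are reproduced.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    seq = "".join(seq.split()).upper()
    return "".join(complement[base] for base in reversed(seq))


def compact(seq: str) -> str:
    """Remove the spaces that separate the 10-nucleotide blocks."""
    return "".join(seq.split())


# Rows of SEQ ID NO: 23 as listed: positions 1-60 and positions 481-560.
SEQ23_START = "TCCAGGCAGC GGTCGGGGGA CAACCTCGTG TGCCACGCCC GTGACGGCAG CATCCTGGCT"
SEQ23_END = (
    "TTCCTTTTTC AAGCCAAATA CCAGATCCGC TACCGAGTGG AGGACAGTGT GGAATGGAAG "
    "GTGGTGGACG ATGTGAGCAA"
)
SEQ26 = "TCCAGGCAGC GGTCGGGGGA CAAC"
SEQ27 = "TTGCTCACAT CGTCCACCAC CTTC"

matches_5_prime = compact(SEQ23_START).startswith(compact(SEQ26))
matches_3_prime = compact(SEQ23_END).endswith(reverse_complement(SEQ27))
print(matches_5_prime, matches_3_prime)
```

Both comparisons hold for the sequences as listed here.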
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WO 98/11225 

(2) INFORMATION FOR SEQ ID NO: 28: 






(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6663 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
CCCAGAACTC TTGGACGCTG AGGCAGGAGG ATTCCCAAGT TTCAAGACAG TGTGTTTCTA 60

GGTAATGAGA CCCTGTCAAG AAAAGAAAAG AAATAAAGAG ACAAGAAAAT GTTTATAGGC 120

TGTGAGACAG CTTGGTGGGT AAGGGGCACT TGCCTCCAAT CAAGATGACC TCAGCCCCAT 180

CCCTAGGAAT CCATGGTAGA AGGAGAAAGC AAACTCGCAG CTGCTGACCT CCATACATGT 240

GCTCCAATGT GCACACACAC AGGGAGACAT AATCAATTAA TAGGATGTAT TTGCTTAGAT 300

TTGAGTAGGC ATTTATGACT GATGTTTTAA AATTTTTATT TGATTTTATG AAAATATACC 360

TGTTTGTATT TGGTTTGGTT TGGTTTGAGT TTTGTTTATT TGAGACAGGG CTTCTCTGTG 420

TAGTCCTGGC TGTCCTTGGA ACTCACTCTG TAGACCAGGC TGGCCTTGAA CTCAGAAATC 480

CGCCTGCTTG TGCTTCCCAA GTGCTTAGAT TAAAGGTGTG CACTGCCATT CAGCAAAATT 540

GCATACTTTA ACCCCAGTAT TTGGGAGGCA GAGGCAGACT AATGTGTGAA TTCCAGGCTA 600

GCCAAGGATA CAGAGTGAGA CCCTATTCTT ACCCTCCCCC CCCAAAACCC CAAAATGTAT 660

TTTGTGCTTG TGTATGTACA TGTGTGTTGC AGCACGTAAA TGTCCAAGGA CAACTTGTAG 720

AAGTTCTCTC CGTTCACAGT CTAAGTCCTG AATTCAAACT AAGGTCCTCA GGCTTAGCCA 780

CAGTCTTCTT TATGTACTGA GCCATTTCAC TGGCCCTGGA TTGACTGATG AATTAATTTT 840

TGAGATAAGG TCTCTTGTAG CTCTAGCTAG GCTCAAACTA TGAACTCCCA AGGTCATCTT 900

GAGCTGCTGG TACTCTTGCT TCCACCCCAA GTGGTGGAAT GATACTCAGG CAGCACTTCT 960

CTGGGGAAGG GGCTGGCCTT GGCCTTGATT TTGTTGCCTC AGCTTCAATG AGTGCTTGGG 1020

TCTCGTTGTT TCTTTTCTTT ATCTGTGAAA TGGGTGAACA CCTGTTCAAG ACTTCCTGAC 1080

TCTTGAAACA TCCAGGCAGG GTGAGGGACT TGAAGTGGGC TCATCCCATG CCTAACAAAG 1140

TGTCGTCTTT GACCCCAGAC ACAGCTGTAA TCAGCCCCCA GGACCCCACC CTTCTCATCG 1200

GCTCCTCCCT GCAAGCTACC TGCTCTATAC ATGGAGACAC ACCTGGGGCC ACCGCTGAGG 1260

GGCTCTACTG GACCTTCAAT GGTCGCCGCC TGCCCTCTGA GCTGTCCCGC CTCCTTAACA 1320

CCTCCACCCT GGCCCTGGCC CTGGCTAACC TTAATGGGTC CAGGCAGCAG TCAGGAGACA 1380

ATCTGGTGTG TCACGCCCGA GACGGCAGCA TTCTGGCTGG GTCCTGCCTC TATGTTGGCT 1440

GTAAGTGGGG CCCCAGACAC TCAGAGATAG ATGGGGGTTG GCAATGACAG ATTTAGAGCC 1500

TGGGTCTTCT GTCCTGGGGC AGAGCCATGG GCTCTCACTT GCATGCAGGC ATGGTCATAC 1560

CCAGCACAGG CATTGCAACT CTAGGGACAG CTGTGGCTGC ACTGTCCCCT GTGTACCCCA 1620

CAGCTTTAGA AAAGCTGTCA TGTTTTCCTT GTAGTGCCCC CTGAGAAGCC CTTTAACATC 1680

AGCTGCTGGT CCCGGAACAT GAAGGATCTC ACGTGCCGCT GGACACCGGG TGCACACGGG 1740

GAGACATTCT TACATACCAA CTACTCCCTC AAGTACAAGC TGAGGTTGGT ACCCAGCCAA 1800

GCCTTGCTGT GTGACTTCTG GCAATACTTA CCTTCTCTGA TCAAATATGT TCCTGTTTAT 1860

GAACTCAAAA GGGACTCTCG CACCTCCACA GGTGGTACGG TCAGGATAAC ACATGTGAGG 1920

AGTACCACAC TGTGGGCCCT CACTCATGCC ATATCCCCAA GGACCTGGCC CTCTTCACTC 1980

CCTATGAGAT CTGGGTGGAA GCCACCAATC GCCTAGGCTC AGCAAGATCT GATGTCCTCA 2040

CACTGGATGT CCTGGACGTG GGTGAGCCCC CAGTGTCCAC CTGTGTTCTG CCCTAGACCT 2100

TATAGGGCGC CTCCCCCCCA TCCCCCCAGA CTTTTTGGTT CTTCTAGAGG TCTTAGCCAC 2160

AGCCACGGTG GTTGCAGGAC AGTGGTTGTT CATAACTTAA TGCAAAGACT TTCCCCCAAG 2220

ACAGTCAAGA TTTTTCCCCT CCCCACCCCC AACACACACA TACACACACA CTCTGCAGAG 2280

AACACCTGGC CTGACCACCC TCCCTCTCTA CAGCCCAGGT GTTCAGAAGG GAGTCCTAGG 2340

GGACTGAGAG GAGGCGCCCA GGTCTGAAGG CGCCCCAGGA AGCCGAGGCC TTGAGCTGGG 2400

GGGGGGGGCG AGGGTTGGAG GCACGAACTG GATGATCCCT GAGCACAACT GGGCCTAATC 2460

TAATTAGGGT GTTCCCAGCC CAAAGCAGCC TGGGCCATTT AACCCTTCAA GTGCCTCACT 2520

GAAGACTCAG GGGAGAGATC AGCTTGTACT CTCTCCATGG TCCCCCAGGA GGGTTCCTGG 2580

GTGCCCCTGG CTCATTCCCA CATCCAGAGG TTTTGTGTCT TCCTGGCATC TAACCCTCAG 2640

TTGTGCTCTG TGGCTGGCAC AGCTGCCCCG TGGAGGCTCT TGGTAATGTA CAAGGCATCA 2700

GAGGTGGACA TGGGATGGGG ATACATAGGG ATGGAGCCAA ATAGCACCTC AAGGTGGGGT 2760

GATATACAAT AAAGCTTGTC ACCCTGACGC TCAGAAAGCC TACTCATGAT GATCACAATT 2820

GTTGACATCA CTCTGGGACA TGTAGTGAGA CCCTAGCTCA AAACACAGAC AGTAGCTTTA 2880

AGAGTCAGCT TGTGACTTAA TACTGGAACT CAGGGCCTAA TAGGTGCTGG GTGATGCTCG 2940

CCTCACTCCC TGTTTAGTGA GATCTCTGCG CTAATCTCCA CCCCAGCTGG GTGGGCTGCT 3000

CTGTCCCCTT GAGGGCAGGA ATGTGTGTCT TCCATCAGAG ATAGGACCCG TGGTAGCAGC 3060

AACTGCTGCT GGCTGTTTCT GGAATATTAA ATGACAGTAA TCTATCAGGC CTGGGTGAGT 3120

AGCTAACAGG GGTGGGGGCG TGGTCTGGAA AACGCAGATA GGGTCATAGG AGCCACTGCA 3180

GCCTAGATTA CACCACTGGG TGTTCTGTCA CTAGGCCATT CTCACCAAGC AGTCCTCAGA 3240

ACTGGGAGCA CTGTTGCCAG CATTTAATGC CAGCATTTAA TGCCAGCATT AGGGGAGGCA 3300

GAGGCAGAAG GATCTCTCTG AGTTCAAGGC CATCCTGAAT TTACATAAAG AGCTCCAGGC 3360

CAGCCAGGGT GCGCAGTAAA ACCTTGTCTC AAAAAACAAA GCATCTTTAG TGACCAGGCT 3420

TGCTCCACCC CCAGTGACCA CGGACCCCCC ACCCGACGTG CACGTGAGCC GCGTTGGGGG 3480

CCTGGAGGAC CAGCTGAGTG TGCGCTGGGT CTCACCACCA GCTCTCAAGG ATTTCCTCTT 3540

CCAAGCCAAG TACCAGATCC GCTACCGCGT GGAGGACAGC GTGGACTGGA AGGTGCCCGT 3600

CCCGCCCCGG ACCCGCCCCT GACCCCGCCC CCCGCATCTG ACTCCTCCCT CACCGTGCAG 3660

GTGGTGGATG ACGTCAGCAA CCAGACCTCC TGCCGTCTCG CGGGCCTGAA GCCCGGCACC 3720

GTTTACTTCG TCCAAGTGCG TTGTAACCCA TTCGGGATCT ATGGGTCGAA AAAGGCGGGA 3780

ATCTGGAGCG AGTGGAGCCA CCCCACCGCT GCCTCCACCC CTCGAAGTGG TGAGCACCTC 3840

TCCAGGGCTG GCTGGCCCAT GGAATCCCCA ATCCATCCTG TTCCTTCCCC CCCACCCTTT 3900

TTTTGAGACA GCGTCTTCAG GTAGCGCATG CTGGCCTTAA ATTCAGTATG TAGTCAAGGA 3960

TGACCTCGAG CTCCTGGTCT TTTTGTCTCC ACTTAGAGAC AATGGCCAGT GGCCATCACC 4020

ACCTTTGGGA GACTAGCCAT GGAGTCTATT TAGCCTGTCA TTTGGTGACA GATGGAGTAC 4080

AACAGTGTGA CCTCTTGTAA GAGAACTGAA GACAGGCTGT TTTTAACCCC AATATCCTAG 4140

GCTCTCTAGA GGTTAACTTT ATATAAAATA GAGACTATTA CAGCCAGTTA TCACATGGTC 4200

CCACAGAACC TTTTGTCACA CAACCTATAG ACCACAGTGC CTGTGCCTAC CACATAAGGG 4260

TCTCTACTGC TGGCCCACCC CTCCAACCCT TAAAAGGTAA CCTAGGCAGC CTTAATATTT 4320

GCAATCCTCC TACCTCAGCC TCTTGAATGC TCAGAAACCA GGCATTAACC CAAGTTTCTC 4380

TTCTCTGGGT CCCTTTCTTA AGGTGGGAGG GCCTAAAGAT GACTTCCTTT GTCCTGAAGA 4440

CTCTCCGAGC CCATGGATCT GCACTCTCTA ATATGAAATA TATTGCATAA AATGTCTGGC 4500

CTCAGTTTCC CCACCTGTCA GGTTTAGGCA GCACAGTCGG TCCAAGACAC TTCATTATTT 4560

GCAGGCAGTA TAAGAAGAAG CTCCCATCCC CCACCCGCTT CCTCCGGTCC CTAAGACAGA 4620

ATACTTCTAC ACTGAAACTG AACTCTCGCA GACGCATATG CTCACTTTAA TGATGATGAA 4680

ATAATGGGGA AACTGAGGCT CCGAGAGATT CCTGGAGGAA GAGGGTCAAA ACCAGCTCCA 4740

GGAAGCTCTC CAGCCCCCAT CCGGGCCTCT CCAGGTTCTG GGCTTGGCGG GAGTGAACAC 4800

AGCTGGGAGG GGCTGGAGCC TGGGAGCTTT GGCCCTTGCT CGTGCCCAGC ACCTGCGATT 4860

CTTGCACGGG AGCCAGCAGG CGGCTGCGTC CGCCCGAGAG ACTGAAGAAG CCGGGGGTAG 4920

GGTTGGAGGG AGGTAAGCAG GGGCTGTGGG GGCCGAAGCT TGTGCCAGGG CCTGTCAGCG 4 980 

AGTCCCCAGT TTTATTTATG GCGTGAGGCC GATGTCCTTA TCCGCTGGCC TGCTGGGGGA 504 0 

TGGCTGCGGC TGGGGATTGG ACCCAAGGGC TGGCTTCCCA CTCAGTCCTC CAGCCCACTC 5100 

CATGTCACAC CCGTGCATTC TCTGAGGCTT ATCTTGGGAA CCCGCCCTTG TTCTGTGCTG 5160 

TCTGTCTCTA TTTCTGTCAT TCACTTTCCC AGAGCCTTTT TTTTATGCTT TTAATATAAC 5220 

TACGTTTTAA AAATTGCTTT TGTATAATGT GTGTGCCTTC GTGAGCGTGC GTGCCACAAC 5280

ACACACGTGA AGGTTAGAGA ACTTTGTTGA GTAGGCTCCT TCCACCATGT GGGACTAGGG 5340

CTGGCGACAA GAGCAATTAC TGAGTCATCT CGCCAGCCCC TCACCCCTCA CTTCCCATCC 5400

TGTTTGGATA GTCATAGGTA ATCGAAGGTA AATCGCTGGC TTTAATTTCG TAGCTATCCT 5460

GCCTCAGCCT ACCAAGTGCT GTGCTACCAC GTTTGTGGGA GGGGCTCTCC TCCCAGTGTC 5520 

TGGGGGTGAC ACAGTCCCAA GATCTCTGCT TTCTAGGTCT TTGTCTTAGT TTGCCCCTTG 5580 

CTTTGTCCGT GTCCCTAGAG TCTCCGGCCC CACTTATCCA TTGACTGGTC TTTCCTTTAC 5640

CGAATACTCG GTTTTACCTC CCACTGATTT GACTCCCTCC TTTGCTTGTC TCCATCGCCG 5700 

TGGCATTGCC ATTCCTCTGG GTGACTCTGG GTCCACACCT GACACCTTTC CCAACTTTCC 5760 

CCAGCCGAAG CTGGTCTGGT ATGGGAGGCC GCCGTCCCGC GCGCGCCTCC TGCTGGCCGC 5820 

GCCCCAACAC TGCCGCTCCA TTCTCTTTAG AGCGCCCGGG CCCGGGCGGC GGGGTGTGCG 5880 

AGCCGCGGGG CGGCGAGCCC AGCTCGGGCC CGGTGCGGCG CGAGCTCAAG CAGTTCCTCG 5940

GCTGGCTCAA GAAGCACGCA TACTGCTCGA ACCTTAGTTT CCGCCTGTAC GACCAGTGGC 6000 

GTGCTTGGAT GCAGAAGTCA CACAAGACCC GAAACCAGGT AGGAAAGTTG GGGGAGGCTT 6060 

GCGTGGGGGG TAAAGGAGCA GAGGAAGAGA GAGACCCGGG TGAGCAGCCT CCACAACACC 6120 

GCACTCTTCT TTCCAAGCAC AGGACGAGGG GATCCTGCCC TCGGGCAGAC GGGGTGCGGC 6180 

GAGAGGTAAG GGGGTCTGGG TGAGTGGGGC CTACAGCAGT CTAGATGAGG CCCTTTCCCC 6240

TCCTTCGGTG TTGCTCAAAG GGATCTCTTA GTGCTCATTT CACCCACTGC AAAGAGCCCC 6300

AGGTTTTACT GCATCATCAA GTTGCTGAAG GGTCCAGGCT TAATGTGGCC TCTTTTCTGC 6360 

CCTCAGGTCC TGCCGGCTAA ACTCTAAGGA TAGGCCATCC TCCTGCTGGG TCAGACCTGG 6420 

AGGCTCACCT GAATTGGAGC CCCTCTGTAC CATCTGGGCA ACAAAGAAAC CTACCAGAGG 6480

CTGGGCACAA TGAGCTCCCA CAACCACAGC TTTGGTCCAC ATGATGGTCA CACTTGGATA 6540

TACCCCAGTG TGGGTAGGGT TGGGGTATTG CAGGGCCTCC CAAGAGTCTC TTTAAATAAA 6600

TAAAGGAGTT GTTCAGGTCC CGATGGCCAG TGTGTTTGGG GCCTATGTGC TGGGGTGGGG 6660

GGA 6663



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 186 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

Asp Pro Thr Leu Leu Ile Gly Ser Ser Leu Gln Ala Thr Cys Ser Ile
1               5                   10                  15

His Gly Asp Thr Pro Gly Ala Thr Ala Glu Gly Leu Tyr Trp Thr Phe
            20                  25                  30

Asn Gly Arg Arg Leu Pro Ser Glu Leu Ser Arg Leu Leu Asn Thr Ser
        35                  40                  45

Thr Leu Ala Leu Ala Leu Ala Asn Leu Asn Gly Ser Arg Gln Gln Ser
    50                  55                  60

Gly Asp Asn Leu Val Cys His Ala Arg Asp Gly Ser Ile Leu Ala Gly
65                  70                  75                  80

Ser Cys Leu Tyr Val Gly Leu Pro Pro Glu Lys Pro Phe Asn Ile Ser
                85                  90                  95

Cys Trp Ser Arg Asn Met Lys Asp Leu Thr Cys Arg Trp Thr Pro Gly
            100                 105                 110

Ala His Gly Glu Thr Phe Leu His Thr Asn Tyr Ser Leu Lys Tyr Lys
        115                 120                 125

Leu Arg Leu Val Arg Ser Gly »   His Met *   Gly Val Pro His Cys
    130                 135                 140

Gly Pro Ser Leu Met Pro Tyr Pro Gln Gly Pro Gly Pro Leu His Ser
145                 150                 155                 160

Leu *   Asp Leu Gly Gly Ser His Gln Ser Pro Arg Leu Ser Lys Ile
                165                 170                 175

*   Cys Pro His Thr Gly Cys Pro Gly Arg
            180                 185

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 35 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

AGCTGGCGCG CCTCCCGGGC GGATCGGGAG CCCAC 35



(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

AGCTACGCGT TTAGAGTTTA GCCGGCAG 28



(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

Met Val Leu Ala Ser Ser Thr Thr Ser Ile His Thr Met Leu Leu Leu
1               5                   10                  15

Leu Leu Met Leu Phe His Leu Gly Leu Gln Ala Ser Ile Ser
            20                  25                  30



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

Ile Leu Pro Ser Gly Arg Arg Gly Ala Ala Arg Gly Pro Ala Gly Asp Tyr Lys Asp Asp
1               5                   10                  15                  20

Asp Asp Lys



(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 73 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

GATCTTGCCC TCGGGCAGAC GGGGTGCGGC GAGAGGTCCT GCCGGCGACT ACAAGGACGA 60

CGATGACAAG TAG 73

(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 73 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

AACGGGAGCC CGTCTGCCCC ACGCCGCTCT CCAGGACGGC CGCTGATGTT CCTGCTGCTA 60

CTGTTCATCC TAG 73

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

CCCACGCTTC TCATCGGATT CTCCCTG 27



(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

CAGTCCACAC TGTCCTCCAC TCGGTAG 27



(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11832 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

GCGGCCGCTG CACTGATTAC TCACCGCGTG GCGCACCCCA CCCGCGGGCC GCTGAGTGGA 60 

TTTTTCCGTG GGGGGATGTG AAGAAGTTTA GGGAGAACTC TTCTGCACCG ATGGGAACTA 120

GGAATGCAGG GTTCGGTCCC GTTCCCCAAA GGACACACCT CTCCCCATAA GCCCACTCAT 180 

AAGGGCTCCC TGCACGCGCT CCGGGACATC CCCATATCCA ATACCCGCAG ATATGATAGT 240 

TGAGAAGGGA CCAGAGGCCG GAGACTCCCT CCCTGCCTTC TGGCTTTCCC CCCCCCCTGC 300 

ACGAAACGAG ACTACAGCGA TGGGAGAGGT GGCATGAAGG CTTAGGGTGG GGATCGGTAG 360 

GACCCATGCA CCCAGAGAAA GGGACTGGTG GCAACTTTCA AACTCTCTGG GGAAGGAAGA 420 

AGGGCTGAAA GAGGATGAAC GGGCTCAGGT ACTGCTCAAT GTGTGTGTGG CGGACCAAAG 480

TGGGTATGGG GGCCCCGTAA GAGGGGCGGG GAAGGTGGAT AGGAAGGATC CCGGTAGACT 540

GGAGGGGATC CTGGAAAAGC ACCAGGGCTG CGAGCTAGGA ACCCATTCGG AGTTAAGGGT 600 

ACAGGATCCC AGATGAGGGG GTGGGAAGCC TGGGACGGGC GGGACCAGAG AGGGAGGTCC 660 

CACGGGCTGG TGGGGAAAGA GTGGGGGGCT TCGCGCAGGA GGATGGGACG TTCAGGAGTG 720

GTAACTGGGC GGAGGCCGGC CGGGCGGGGC GCGCGGTGCC CGCGGGCGGT GGGAAGGCCG 780 

GTGCGGGGCC CACGATCAAC CCCCCCCCAG GGGCCGGGCC GGGCCGGGGG CGGGGCCGGG 840 

CGGGGCGAGC GGCGCATTAG CGCCTTGTCA ATTTCGGCTG CTCAGACTTG CTCCGGCCTT 900 

CGCTGTCCGC GCCCAGTGAC GCGCGTGAGG ACCCGAGCCC CAATCTGCAC CCCGCAGACT 960 

CGCCCCCGCC CCATACCGGC GTTGCAGTCA CCGCCCGTTG CGCGCCACCC CCATGCCCGC 1020 

GGGTCGCCCG GGCCCCGTCG CCCAATCCGC GCGGCCGCCG CCGCGGCCGC TGTCCTCGCT 1080

GTGGTCGCCT CTGTTGCTCT GTGTCCTCGG GGTGCCTCGG GGCGGATCGG GAGCCCGTGA 1140

GTACCGTGCG CCCTGCTCCC CACCTCCCCA GGGAAGCCGG GATCCGGCGC CCCGGGGGGT 1200 

AGTCGCGGGG GATGGAAGAA GGGGCGCGAG CGCCACCTGG ACGTCCCGGG AACAAAGGAA 1260 

GGCGGCCCTC GGGGCGCCCT CACCTGTGGG GCTCATGGCA CCACCACCCA GCCTCCCAAG 1320 

AGTACCCCGT TATACATCAG AGGCCTCTTA TCTGTATCCC CTTTGCGAGG CTGTCTGGCC 1380 

AGGCTCAGTT TGAAGGACAT CGCAGTGTCC TGGGACCCCC CTCCTTCAGG GTGCTGGGAC 1440

GCTTCGGGGC GCACGCCTGT GTCTTGGATA TCAGAGCGGA AGGGAAGCCT CCCTGGCCGG 1500 

GGGCGCACGC TTGGGTGCGT TGGGTTGGGT GCTGGCGCAA AGTGGGGTCC CCTCCCCCAT 1560 

GAAGTGATGA TCCCCGGGGG GAGGGTGGGG CGTTATCGTG AGCCCTCCTG TCCGCCTGGC 1620 

ATGCGGCCCG GCGTCCCTCG GGACTTGCCT CTCCGTGGGG TCGGCGCCGC CCCCTCCCCC 1680 

CTATAGCAGA CTCCATGCTT TGGTATCCTC GAAGTCCTCT CCACTGGTGG GGCTCACAAC 1740 

CGGTCTCATT CAGGCTGCGC TGGGTTGAGA GCCTCTAGCG ACTGAAATTT CGGTGAGGAG 1800 

CGAGAGCAAG CGTGTCCGGG CACCGCGAGC CCAGACTTCA TTGTCTAAGG GGCACCCAGT 1860 

GGGGGTCAGC TGCCGAGAGA ATCCCACTGT CCCAGGAGGA ACTCCTGGCC TTGAGCCCCC 1920 

ATCACCCAAC GCACACATCC CCGCCAGGAT GCGGTCTCCA CATCCAGACC CTCTCTGGGA 1980 

CACACCCAAA GACACACAAA AGAGCCCCAC TGGCTTATGT CCCGTCACCC TGCCCTCCGA 2040

CGCGCGCTGC AGCCCAGATG CGTATTCGCA CACCATCGCG GCGCTCGCAT TCCATCCTCT 2100 

ACACACACAC ACACACACAC ACACACACAC ACACACACAC ACACACAGAC ACGCACACAC 2160 

ACACGCACGC ACACACACGC ACGCCCGCAC TCGTGGTCCC ACATTTATTT CACAGGGGAG 2220

GCAACACCGG GGTACGCATA TGGTTGAGTG CACTGGAGAT CTTTCCCCAC CACTCTCAGG 2280

ACCCCATCCG GAGACACAGG CCACACCGCA GGGGCACCAC GCTGCGCTGC TGCTCTGGGC 2340

TAGTAGTCTT GTGCAGTTTG TCCGCGGTGT CTGTGGACGC CCTCCCGCTC TTGTCAGGGG 2400

ACAGGAACCT ACACTCCTGC TTGCCCAAGG CGGCTGGGCA GGTGATGTGG TGACACCCGG 2460

GACCTTTCCG GGGAGTTGGT GTTGCTGCCA AGCCTGGGTA GTTTTTGAAT GCCACCAATA 2520

GCGCTAAGCT TTGTTTCCGG GCGGGCTGCA GAGCAACAGG CGAAGGTGGC GGAGTGGGGG 2580

TGGCGCGTGT GTTTTTTCTT TTAAGGGGGA GAGAAATTAA ATAAGAGGTT CTCACACCTC 2640

TGCAATCTGT TTGTACTTAC CGTGTGTCTT AACACCTGAC CAGCCAGCCG GTGGGTCGTA 2700

AAAGTGTATG CAGGTACCAG CGGGACAGGA GATGGGGGCC CCTGGGGTAT GGCTGGGATG 2760

GAGGCCACCT TCCCGTTGGC CTTTCAGGGA ATCTCACACT TTTCCCTTTT AAAACACATG 2820

GTGTTCTTTT TAATAACGGC AGCAACTCCG CATTGGGAAA GGGGGAAATA AGCTTGTATA 2880

GGCCCCGGCT TTGTGGAAAG GAGGGGAAGA GGGAAGAAAA AAGGAGGGGT GTCTCCTCCA 2940

GGCTTAGGGG GCTGTCAGCT GCTGCTCTGT CTAGCTTGGC ATGTGTGTGC CCCAGTCCCC 3000

AGTGGCTTTG GCCCATTGTT TGTGGAAGCC AAGAGGGAGA CTGGAGTCCT CTATCTCTGG 3060

TACTCCAGAG TCAGGCTTCT CAGTCCGAGC CCAGAGAACG TCTTCCCTGT TTTATGGAGG 3120

GAATCAGGGA AGGGGGTGCC AGGTGGACTA CGTTCTGCTG AGGACTGTAC CAGTCGCTCG 3180

AAGGAGAAAG CTTGGGCTTG CCCCCCTCCC CCCTCAAGCC ACGAAGGGCA GCTGCTAGGC 3240

TAGTGTGGTA AAAGGGCATT ACTCCCCAGC CAGGACCCCC CAGAGAGTCC CCTTCCTGGC 3300

CAGACAAATG CTGGGGAGGG ACAGAGGGGT GTGATCATTG CCCAGGAGTG CAGACAGTGG 3360

GGTCCCGGGT CGGGCAGTGC CTCCCACCCT GCTGAGGGGG GCGCCCAGGC AGGAAGCGGT 3420 

GGGTGGGCCG GGGTAGAGAC GCTGGCACGT CCCAGTTCAT GCCGAAGGAA TTCTGAATTA 3480 

GCGGGCGGCT GGCTGCCTGG GACCTCCGGG GCGGCCCCCT GCCCCCCGCC GCTCCGTCTG 3540

GCCTGCTCCT CCTGCTCCTT CGCACGGACG CTGAGACCTC CGCTGAGCCC TGGGACAAGC 3600

CCCAAATGCA ACTGCGATTG CAGGCTTCGC AAGACCCGCC TCCTCCCAAG GCCAAATTTG 3660

CCTGGGAGAA GTCATTCAGG GCCCAGACTA GAACCATGTT GGTGCCACCT CATCCATCTG 3720

GGGCATGAAG GACCGTCCAG GGCTGCAGTT TAGCTTCTTA ATAGGAACCT GGGGGTGGGT 3780

GCAGCCTCTG TTCTCCGAGC CTCTTTGGAA ATCGGTTTTG TTTTTGTTTT TGTTTTTTCC 3840

AATACTCTTT TCCTCTCATC CCATCCCGGG ACTGTTTTCC TCCCTAAGGG TTGAGAGCCC 3900 

TGCAGTCTTC CCTAACCTTT TCTTTGCTTC TACCCCAGGG CCTTTGCACA TGGAGTCCCA 3960 

CCTCTCCCCT TGCCCAACTG GGGCTCCAGC CTTACTGCAT TTGGCTCTTG GTAACTGTCC 4020

CAGGGCCTCT CTGACACACA GGGTTGTAGC CCCAGCTCCC TCTCTTCTCC TCCCCCCTTT 4080 

CTCTTTTGCT TCTGAGACTT AATTTTTTTC TTTTTCTTTT TGGCTTTTTG AGACAGGGTT 4140

TCTCTGTACA GCCCTGGCTG CCCTGGCACT CATTCTGTAG ACCAGGCTAG CCTCAAACTC 4200 

ACAAACCTAC CTGCCTCTGC CTTTCCAGTG CTGGCACTAA AGATGTGGGC CACCACAACT 4260 

AGTAGTTAAG TGTTTTGCTG TGTCTTTATT CCTATAGTGA CCTCAGTTCC TGGCATATTG 4320

TAGGCGATGG ATGGATGAAT GGATGGATGG ATGGATGGAT GGATGGTTGG ATGGAGCAAG 4380 

CTTGAATCGT CCTGAGTGAA AAAAGAGACC TCAGAGAACT GAATGGAGTT AGGTTCCCAG 4440 

GGCAGCCTGG CCTGCTGGTC TCATGGGAGC TCCCTGTGAA ACTTCCCCCA CACCTCCCAC 4500

CACCCTGCCA TCCTGTGTGG CTGACAAGAA AGGCCAATGG CCAGATGGGG ACACAGACTC 4560

AGGGAAGCTT GGAATATGTT CCCCTCCTCA TATCCTAGGC CTTGTTGTCC CCCTGAGGGC 4620 

CCAGCCTATG AGTAGGGCAG CTGTGGGCTG CCCTAAGGTT GGGTAGGCAA GAAGGGGGTG 4680

GTCCCTCAGG GTGGGTCACA GGATTGAGGT CATTTCCAAA GTGGCCATCA CAGTGGCCCT 4740

AGGAAATGAT TGTGGAGAGT CAGAACTCCT GTTGGGAGTT GTAGAGGGCC TTGCATGTGG 4800

GCTTCTGTGG CTGTCCCTTC TCTTGTGGTC CTTTGCACAG TCCCCTCGTG TGTGCTGGGA 4860

TGTGAGGAGG GCACGGGGAA AATGAAGGCT CAGCCCCTCA GCTTGCCCTT CACGGTTCAC 4920

CCAACAGGGC TCACCTCTCC TCTGGACAGG CTCTCACTGT ATGCACAGAT TGGCCTCACA 4980

TTTGATTCCC TTCCTTTGGT CTCCTGGGAT GACAAACATT TACCAGGGTA GGATTTTACA 5040 

TTTTAGATAT GTCCATTCTC CAGAAACACA CTTGTGAGGT TAGGGTATCA GTGAAAGGAC 5100 

ACCACCAGGA CAGACAAAGA ATTGGAGAGG AAGGAAATTG GTAAGCCAGG CCATGCTTGA 5160 

TGGCTTATGT GTAATCCCAG AACTCTGGAC GCTGAGGCAG GAGGATTCCA AGTTTCAAGA 5220 

CAGTGTGTTC TAGGTAATGA GACCCTGTCA AGAAAAGAAA AGAAATAAAG AGACAAGAAA 5280

ATGTTTATAG GCTGTGAGAC AGCTTGGTGG GTAAGGGGCA CTTGCCTCCA ATCAAGATGA 5340 

CCTCAGCCCC ATCCCTAGGA ATCCATGGTA GAAGGAGAAA GCAAACTCCA GCTGCTGACC 5400 

TCCATACATG TGCTCCAATG TGCACACACA CAGGGAGACA TAATCAATTA ATAGGATGTA 5460 

TTTGCTTAGA TTTGAGTAGG CATTTATGAC TGATGTTTTA AAATTTTTAT TTGATTTTAT 5520 

GAAAATATAC CTGTTTGTAT TTGGTTTGGT TTGGTTTGAG TTTTGTTTAT TTGAGACAGG 5580

GCTTCTCTGT GTAGTCCTGG CTGTCCTTGG AACTCACTCT GTAGACCAGG CTGGCCTTGA 5640

ACTCAGAAAT CCGCCTGCTT GTGCTTCCCA AGTGCTTAGA TTAAAGGTGT GCACTGCCAT 5700

TCAGCAAAAT TGCATACTTT AACCCCAGTA TTTGGGAGGC AGAGGCAGAC TAATGTGTGA 5760

ATTCCAGGCT AGCCAAGGAT ACAGAGTGAG ACCCTATTCT TACCCTCCCC CCCCAAAACC 5820

CCAAAATGTA TTTTGTGCTT GTGTATGTAC ATGTGTGTTG CAGCACGTAA ATGTCCAAGG 5880 

ACAACTTGTA GAAGTTCTCT CCGTTCACAG TCTAAGTCCT GAATTCAAAC TAAGGTCCTC 5940 

AGGCTTAGCC ACAGTCTTCT TTATGTACTG AGCCATTTCA CTGGCCCTGG ATTGACTGAT 6000 

GAATTAATTT TTGAGATAAG GTCTCTTGTA GCTCTAGCTA GGCTCAAACT ATGAACTCCC 6060 

AAGGTCATCT TGAGCTGCTG GTACTCTTGC TTCCACCCCA AGTGGTGGAA TGATACTCAG 6120 

GCAGCACTTC TCTGGGGAAG GGGCTGGCCT TGGCCTTGAT TTTGTTGCCT CAGCTTCAAT 6180 

GAGTGCTTGG GTCTCGTTGT TTCTTTTCTT TATCTGTGAA ATGGGTGAAC ACCTGTTCAA 6240 

GACTTCCTGA CTCTTGAAAC ATCCAGGCAG GGTGAGGGAC TTGAAGTGGG CTCATCCCAT 6300 

GCCTAACAAA GTGTCGTCTT TGACCCCAGA CACAGCTGTA ATCAGCCCCC AGGACCCCAC 6360 

CCTTCTCATC GGCTCCTCCC TGCAAGCTAC CTGCTCTATA CATGGAGACA CACCTGGGGC 6420 

CACCGCTGAG GGGCTCTACT GGACCTTCAA TGGTCGCCGC CTGCCCTCTG AGCTGTCCCG 6480

CCTCCTTAAC ACCTCCACCC TGGCCCTGGC CCTGGCTAAC CTTAATGGGT CCAGGCAGCA 6540

GTCAGGAGAC AATCTGGTGT GTCACGCCCG AGACGGCAGC ATTCTGGCTG GCTCCTGCCT 6600 

CTATGTTGGC TGTAAGTGGG GCCCCAGACA CTCAGAGATA GATGGGGGTT GGCAATGACA 6660 

GATTTAGAGC CTGGGTCTTC TGTCCTGGGG CAGAGCCATG GGCTCTCACT TGCATGCAGG 6720

CATGGTCATA CCCAGCACAG GCATTGCAAC TCTAGGGACA GCTGTGGCTG CACTGTCCCC 6780

TGTGTACCCC ACAGCTTTAG AAAAGCTGTC ATGTTTTCCT TGTAGTGCCC CCTGAGAAGC 6840

CCTTTAACAT CAGCTGCTGG TCCCGGAACA TGAAGGATCT CACGTGCCGC TGGACACCGG 6900 
GTGCACACGG GGAGACATTC TTACATACCA ACTACTCCCT CAAGTACAAG CTGAGGTTGG 6960 
TACCCAGCCA AGCCTTGCTG TGTGACTTCT GGCAATACTT ACCTTCTCTG ATCAAATATG 7020 
TTCCTGTTTA TGAACTCAAA AGGGACTCTC GCACCTCCAC AGGTGGTACG GTCAGGATAA 7080 

CACATGTGAG GAGTACCACA CTGTGGGCCC TCACTCATGC CATATCCCCA AGGACCTGGC 7140 

CCTCTTCACT CCCTATGAGA TCTGGGTGGA AGCCACCAAT CGCCTAGGCT CAGCAAGATC 7200 

TGATGTCCTC ACACTGGATG TCCTGGACGT GGGTGAGCCC CCAGTGTCCA CCTGTGTTCT 7260 

GCCCTAGACC TTATAGGGCG CCTCCCCCCC ATCCCCCCAG ACTTTTTGGT TCTTCTAGAG 7320 

GTCTTAGCCA CAGCCACGGT GGTTGCAGGA CAGTGGTTGT TCATAACTTA ATGCAAAGAC 7380 

TTTCCCCCAA GACAGTCAAG ATTTTCCCCT CCCCACCCCC AACACACACA TACACACACA 7440

CTCTGCAGAG AACACCTGGC CTGACCACCC TCCCTCTCTA CAGCCCAGGT GTTCAGAAGG 7500 

GAGTCCTAGG GGACTGAGAG GAGGCGCCCA GGTCTGAAGG CGCCCCAGGA AGCCGAGGCC 7560 

TTGAGCTGGG GGGGGGGGCG AGGGTTGGAG GCACGAACTG GATGATCCCT GAGCACAACT 7620 

GGGCCTAATC TAATTAGGGT GTTCCCAGCC CAAAGCAGCC TGGGCCATTT AACCCTTCAA 7680 

GTGCCTCACT GAAGACTCAG GGGAGAGATC AGCTTCTACT CTCTCCATGG TCCCCCAGGA 7740

GGGTTCCTGG GTGCCCCTGG CTCATTCCCA CATCCAGAGG TTTTGTGTCT TCCTGGCATC 7800 

TAACCCTCAG TTGTGCTCTG TGGCTGGCAC AGCTGCCCCG TGGAGGCTCT TGGTAATGTA 7860 

CAAGGCATCA GAGGTGGACA TGGGATGGGG ATACATAGGG ATGGAGCCAA ATAGCACCTC 7920

AAGGTGGGGT GATATACAAT AAAGCTTGTC ACCCTGACGC TCAGAAAGCC TACTCATGAT 7980

GATCACAATT GTTGACATCA CTCTGGGACA TGTAGTGAGA CCCTAGCTCA AAACACAGAC 8040

AGTAGCTTTA AGAGTCAGCT TGTGACTTAA TACTGGAACT CAGGGCCTAA TAGGTGCTGG 8100 

GTGATGCTCG CCTCACTCCC TGTTTAGTGA GATCTCTGCG CTAATCTCCA CCCCAGCTGG 8160 

GTGGGCTGCT CTGTCCCCTT GAGGGCAGGA ATGTGTGTCT TCCATCAGAG ATAGGACCCG 8220 

TGGTAGCAGC AACTGCTGCT GGCTGTTTCT GGAATATTAA ATGACAGTAA TCTATCAGGC 8280 

CTGGGTGAGT AGCTAACAGG GGTGGGGGCG TGGTCTGGAA AACGCAGATA GGGTCATAGG 8340 

AGCCACTGCA GCCTAGATTA CACCACTGGG TGTTCTGTCA CTAGGCCATT CTCACCAAGC 8400

AGTCCTCAGA ACTGGGAGCA CTGTTGCCAG CATTTAATGC CAGCATTTAA TGCCAGCATT 8460

AGGGGAGGCA GAGGCAGAAG GATCTCTCTG AGTTCAAGGC CATCCTGAAT TTACATAAAG 8520 

AGCTCCAGGC CAGCCAGGGT GCGCAGTAAA ACCTTGTCTC AAAAAACAAA GCATCTTTAG 8580 

TGACCAGGCT TGCTCCACCC CCAGTGACCA CGGACCCCCC ACCCGACGTG CACGTGAGCC 8640 

GCGTTGGGGG CCTGGAGGAC CAGCTGAGTG TGCGCTGGGT CTCACCACCA GCTCTCAAGG 8700 

ATTTCCTCTT CCAAGCCAAG TACCAGATCC GCTACCGCGT GGAGGACAGC GTGGACTGGA 8760 

AGGTGCCCGT CCCGCCCCGG ACCCGCCCCT GACCCCGCCC CCCGCATCTG ACTCCTCCCT 8820 

CACCGTGCAG GTGGTGGATG ACGTCAGCAA CCAGACCTCC TGCCGTCTCG CGGGCCTGAA 8880

GCCCGGCACC GTTTACTTCG TCCAAGTGCG TTGTAACCCA TTCGGGATCT ATGGGTCGAA 8940

AAAGGCGGGA ATCTGGAGCG AGTGGAGCCA CCCCACCGCT GCCTCCACCC CTCGAAGTGG 9000 

TGAGCACCTC TCCAGGGCTG GCTGGCCCAT GGAATCCCCA ATCCATCCTG TTCCTTCCCC 9060

CCCACCCTTT TTTTGAGACA GCGTCTTCAG GTAGCGCATG CTGGCCTTAA ATTCAGTATG 9120

TAGTCAAGGA TGACCTCGAG CTCCTGGTCT TTTTGTCTCC ACTTAGAGAC AATGGCCAGT 9180

GGCCATCACC ACCTTTGGGA GACTAGCCAT GGAGTCTATT TAGCCTGTCA TTTGGTGACA 9240

GATGGAGTAC AACAGTGTGA CCTCTTGTAA GAGAACTGAA GACAGGCTGT TTTTAACCCC 9300

AATATCCTAG GCTCTCTAGA GGTTAACTTT ATATAAAATA GAGACTATTA CAGCCAGTTA 9360

TCACATGGTC CCACAGAACC TTTTGTCACA CAACCTATAG ACCACAGTGC CTGTGCCTAC 9420

CACATAAGGG TCTCTACTGC TGGCCCACCC CTCCAACCCT TAAAAGGTAA CCTAGGCAGC 9480

CTTAATATTT GCAATCCTCC TACCTCAGCC TCTTGAATGC TCAGAAACCA GGCATTAACC 9540

CAAGTTTCTC TTCTCTGGGT CCCTTTCTTA AGGTGGGAGG GCCTAAAGAT GACTTCCTTT 9600

GTCCTGAAGA CTCTCCGAGC CCATGGATCT GCACTCTCTA ATATGAAATA TATTGCATAA 9660

AATGTCTGGC CTCAGTTTCC CCACCTGTCA GGTTTAGGCA GCACAGTCGG TCCAAGACAC 9720

TTCATTATTT GCAGGCAGTA TAAGAAGAAG CTCCCATCCC CCACCCGCTT CCTCCGGTCC 9780

CTAAGACAGA ATACTTCTAC ACTGAAACTG AACTCTCGCA GACGCATATG CTCACTTTAA 9840

TGATGATGAA ATAATGGGGA AACTGAGGCT CCGAGAGATT CCTGGAGGAA GAGGGTCAAA 9900

ACCAGCTCCA GGAAGCTCTC CAGCCCCCAT CCGGGCCTCT CCAGGTTCTG GGCTTGGCGG 9960

GAGTGAACAC AGCTGGGAGG GGCTGGAGCC TGGGAGCTTT GGCCCTTGCT CGTGCCCAGC 10020

ACCTGCGATT CTTGCACGGG AGCCAGCAGG CGGCTGCGTC CGCCCGAGAG ACTGAAGAAG 10080

CCGGGGGTAG GGTTGGAGGG AGGTAAGCAG GGGCTGTGGG GGCCGAAGCT TGTGCCAGGG 10140

CCTGTCAGCG AGTCCCCAGT TTTATTTATG GCGTGAGGCC GATGTCCTTA TCCGCTGGCC 10200

TGCTGGGGGA TGGCTGCGGC TGGGGATTGG ACCCAAGGGC TGGCTTCCCA CTCAGTCCTC 10260

CAGCCCACTC CATGTCACAC CCGTGCATTC TCTGAGGCTT ATCTTGGGAA CCCGCCCTTG 10320

TTCTGTGCTG TCTGTCTCTA TTTCTGTCAT TCACTTTCCC AGAGCCTTTT TTTTATGCTT 10380

TTAATATAAC TACGTTTTAA AAATTGCTTT TGTATAATGT GTGTGCCTTC GTGAGCGTGC 10440

GTGCCACAAC ACACACGTGA AGGTTAGAGA ACTTTGTTGA GTAGGCTCCT TCCACCATGT 10500 

GGGACTAGGG CTGGCGACAA GAGCAATTAC TGAGTCATCT CGCCAGCCCC TCACCCCTCA 10560 

CTTCCCATCC TGTTTGGATA GTCATAGGTA ATCGAAGGTA AATCGCTGGC TTTAATTTCG 10620 

TAGCTATCCT GCCTCAGCCT ACCAAGTGCT GTGCTACCAC GTTTGTGGGA GGGGCTCTCC 10680 

TCCCAGTGTC TGGGGGTACA CAGTCCCAAG ATCTCTGCTT TCTAGGTCTT TGTCTTAGTT 10740 

TGCCCCTTGC TTTGTCCGTG TCCCTAGAGT CTCCGGCCCC ACTTAGTCTC CATTGATTTC 10800 

CTTTCTGACC GAATACTCGG TTTTACCTCC CACTGATTTG ACTCCCTCCT TTGCTTGTCT 10860 

CCATCGCCGT GGCATTGCCA TTCCTCTGGG TGACTCTGGG TCCACACCTG ACACCTTTCC 10920 

CAACTTTCCC CAGCCGAAGC TGGTCTGGTA TGGGAGGCCG CCGTCCCGCG CGCGCCTCCT 10980 

GCTGGCCGCG CCCCAACACT GCCGCTCCAT TCTCTTTAGA GCGCCCGGGC CCGGGCGGCG 11040 

GGGTGTGCGA GCCGCGGGGC GGCGAGCCCA GCTCGGGCCC GGTGCGGCGC GAGCTCAAGC 11100 

AGTTCCTCGG CTGGCTCAAG AAGCACGCAT ACTGCTCGAA CCTTAGTTTC CGCCTGTACG 11160 

ACCAGTGGCG TGCTTGGATG CAGAAGTCAC ACAAGACCCG AAACCAGGTA GGAAAGTTGG 11220 

GGGAGGCTTG CGTGGGGGGT AAAGGAGCAG AGGAAGAGAG AGACCCGGGT GAGCAGCCTC 11280 

CACAACACCG CACTCTTCTT TCCAAGCACA GGACGAGGGG ATCCTGCCCT CGGGCAGACG 11340

GGGTGCGGCG AGAGGTAAGG GGGTCTGGGT GAGTGGGGCC TACAGCAGTC TAGATGAGGC 11400

CCTTTCCCCT CCTTCGGTGT TGCTCAAAGG GATCTCTTAG TGCTCATTTC ACCCACTGCA 11460

AAGAGCCCCA GGTTTTACTG CATCATCAAG TTGCTGAAGG GTCCAGGCTT AATGTGGCCT 11520

CTTTTCTGCC CTCAGGTCCT GCCGGCTAAA CTCTAAGGAT AGGCCATCCT CCTGCTGGGT 11580

CAGACCTGGA GGCTCACCTG AATTGGAGCC CCTCTGTACC ATCTGGGCAA CAAAGAAACC 11640

TACCAGAGGC TGGGCACAAT GAGCTCCCAC AACCACAGCT TTGGTCCACA TGATGGTCAC 11700

ACTTGGATAT ACCCCAGTGT GGGTAGGGTT GGGGTATTGC AGGGCCTCCC AAGAGTCTCT 11760

TTAAATAAAT AAAGGAGTTG TTCAGGTCCC GATGGCCAGT GTGTTTGGGG CCTATGTGCT 11820

GGGGTGGGGG GA 11832



(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

Val Ile Ser Pro Gln Asp Pro Thr Leu Leu Ile Gly Ser Ser Leu Gln Ala Thr Cys Ser
1               5                   10                  15                  20

Ile His Gly Asp Thr Pro
                25

CLAIMS:

1. A nucleic acid molecule comprising a sequence of 
nucleotides encoding or complementary to a sequence encoding a 
novel haemopoietin receptor or derivative thereof having the 
motif: 

Trp Ser Xaa Trp Ser [SEQ ID NO:1],
wherein Xaa is any amino acid. 

2. A nucleic acid molecule according to claim 1 wherein Xaa
is Asp or Glu. 
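
By way of illustration only, the motif recited in claims 1 and 2 can be checked against a residue sequence written in the three-letter code used in the sequence listing. The following is a minimal sketch; the function name `find_wsxws`, its `allowed_xaa` parameter and the example residue string are assumptions introduced for this illustration and do not appear in the specification.

```python
# Illustrative sketch only: locate the Trp-Ser-Xaa-Trp-Ser motif in a
# sequence given as a list of three-letter residue codes.

def find_wsxws(residues, allowed_xaa=None):
    """Return 1-based start positions of Trp-Ser-Xaa-Trp-Ser.

    allowed_xaa: optional set of residues permitted at Xaa
    (for example {"Asp", "Glu"}); None accepts any residue.
    """
    hits = []
    for i in range(len(residues) - 4):
        w1, s1, xaa, w2, s2 = residues[i:i + 5]
        if (w1, s1, w2, s2) != ("Trp", "Ser", "Trp", "Ser"):
            continue
        if allowed_xaa is None or xaa in allowed_xaa:
            hits.append(i + 1)
    return hits

# Hypothetical residue string chosen for illustration.
example = "Ile Trp Ser Glu Trp Ser His Pro".split()
any_xaa = find_wsxws(example)                                  # Xaa unrestricted
asp_or_glu = find_wsxws(example, allowed_xaa={"Asp", "Glu"})   # Xaa is Asp or Glu
asp_only = find_wsxws(example, allowed_xaa={"Asp"})            # Xaa is Asp only
```

Passing `allowed_xaa=None` corresponds to Xaa being any amino acid, and passing `{"Asp", "Glu"}` corresponds to the narrower case in which Xaa is Asp or Glu.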

3. A nucleic acid molecule according to claim 1 or 2 wherein
said nucleic acid molecule is capable of hybridisation under
low stringency conditions at 42°C to:

5' (A/G)CTCCA(A/G)TC(A/G)CTCCA 3' [SEQ ID NO:7]; and
5' (A/G)CTCCA(C/T)TC(A/G)CTCCA 3' [SEQ ID NO:8].
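
As an illustrative aid, the two degenerate oligonucleotides of claim 3 can be expanded into their individual variants, and the reverse complement of each variant translated. The sketch below assumes the bracketed notation (A/G) means "A or G at this position" and uses a codon table limited to the codons that arise here. The helper names are hypothetical. On this reading, every variant of SEQ ID NO:7 is antisense to a sequence encoding Trp Ser Asp Trp Ser, and every variant of SEQ ID NO:8 to one encoding Trp Ser Glu Trp Ser.

```python
import itertools
import re

SEQ_ID_NO_7 = "(A/G)CTCCA(A/G)TC(A/G)CTCCA"
SEQ_ID_NO_8 = "(A/G)CTCCA(C/T)TC(A/G)CTCCA"

# Only the standard-code codons needed for this example.
CODONS = {
    "TGG": "W",
    "AGC": "S", "AGT": "S",
    "GAC": "D", "GAT": "D",
    "GAA": "E", "GAG": "E",
}

def expand(oligo):
    """List every concrete sequence described by a degenerate oligonucleotide."""
    tokens = re.findall(r"\(([ACGT/]+)\)|([ACGT])", oligo)
    choices = [alt.split("/") if alt else [base] for alt, base in tokens]
    return ["".join(combo) for combo in itertools.product(*choices)]

def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(seq):
    """Translate in frame 0 using the restricted codon table above."""
    return "".join(CODONS[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))

def encoded_peptides(oligo):
    """Peptides encoded by the strand complementary to each oligo variant."""
    return {translate(reverse_complement(v)) for v in expand(oligo)}

variants_7 = expand(SEQ_ID_NO_7)
peptides_7 = encoded_peptides(SEQ_ID_NO_7)
peptides_8 = encoded_peptides(SEQ_ID_NO_8)
```

Each oligonucleotide has three two-fold degenerate positions and therefore expands to eight variants.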

4. A nucleic acid molecule according to claim 3 comprising a 
sequence of nucleotides substantially as set forth in SEQ ID 
NO: 12 or a nucleotide sequence having at least 60% similarity 
to the nucleotide sequence set forth in SEQ ID NO: 12 or a 
nucleotide sequence capable of hybridising thereto under low 
stringency conditions at 42°C.

5. A nucleic acid molecule according to claim 3 comprising a 
sequence of nucleotides substantially as set forth in SEQ ID 
NO: 14 or a nucleotide sequence having at least 60% similarity 
to the nucleotide sequence set forth in SEQ ID NO: 14 or a 
nucleotide sequence capable of hybridising thereto under low 
stringency conditions at 42°C.

6. A nucleic acid molecule according to claim 3 comprising a
sequence of nucleotides substantially as set forth in SEQ ID
NO:16 or a nucleotide sequence having at least 60% similarity
to the nucleotide sequence set forth in SEQ ID NO:16 or a
nucleotide sequence capable of hybridising thereto under low
stringency conditions at 42°C.

7. A nucleic acid molecule according to claim 3 comprising a 
sequence of nucleotides substantially as set forth in SEQ ID
NO:18 or 24 or a nucleotide sequence having at least 60%
similarity to the nucleotide sequence set forth in SEQ ID NO:18
or 24 or a nucleotide sequence capable of hybridising thereto
under low stringency conditions at 42°C.

8. A nucleic acid molecule according to claim 3 comprising a 
sequence of nucleotides substantially as set forth in SEQ ID 
NO:28 or a nucleotide sequence having at least 60% similarity 
to the nucleotide sequence set forth in SEQ ID NO: 28 or a 
nucleotide sequence capable of hybridising thereto under low 
stringency conditions at 42°C.

9. A nucleic acid molecule according to claim 3 comprising a 
sequence of nucleotides substantially as set forth in SEQ ID 
NO:38 or a nucleotide sequence having at least 60% similarity 
to the nucleotide sequence set forth in SEQ ID NO: 38 or a 
nucleotide sequence capable of hybridising thereto under low 
stringency conditions at 42°C.

10. A nucleic acid molecule according to claim 4 or 5 or 6 or 
7 or 8 or 9 wherein said haemopoietin receptor is of murine 
origin. 

11. A nucleic acid molecule according to claim 9 wherein said
haemopoietin receptor is of human origin. 

12. An expression vector comprising a nucleic acid molecule 
selected from the list consisting of: 

(i) a nucleotide sequence as set forth in SEQ ID NO:12;

(ii) a nucleotide sequence as set forth in SEQ ID NO:14;

(iii) a nucleotide sequence as set forth in SEQ ID NO:16;

(iv) a nucleotide sequence as set forth in SEQ ID NO:18;

(v) a nucleotide sequence as set forth in SEQ ID NO:24;

(vi) a nucleotide sequence as set forth in SEQ ID NO:28; and

(vii) a nucleotide sequence as set forth in SEQ ID NO:38.

13. A method for cloning a nucleotide sequence encoding a
haemopoietin receptor having the characteristics of NR6 or a
derivative thereof, said method comprising searching a
nucleotide database for a sequence which encodes an amino acid
sequence as set forth in one or more of SEQ ID NO:1, SEQ ID
NO:7 and/or SEQ ID NO:8, designing one or more oligonucleotide
primers based on the nucleotide sequence located in said
search, screening a nucleic acid library with said one or more
oligonucleotides and obtaining a clone therefrom which encodes
NR6 or a part or derivative thereof.
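
The database-searching step of claim 13 can be pictured, purely as a sketch, as translating a candidate nucleotide sequence and scanning the translation for the motif of SEQ ID NO:1. The code below is not the search procedure of the specification. It assumes the standard genetic code, examines only the three forward reading frames, and uses hypothetical function names. The example fragment is a 30-nucleotide stretch copied from SEQ ID NO:38 (positions 8951 to 8980 by the numbering of the listing).

```python
import itertools
import re

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(itertools.product(BASES, repeat=3), AMINO_ACIDS)
}

# Trp-Ser-Xaa-Trp-Ser in one-letter code; Xaa may be any residue (not a stop).
MOTIF = re.compile(r"WS[A-Z]WS")

def translate(dna, frame):
    """Translate one forward reading frame; unknown codons become 'X'."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)

def find_motif(dna):
    """Return (frame, 1-based nucleotide start, matched peptide) for each hit."""
    dna = dna.upper()
    hits = []
    for frame in range(3):
        protein = translate(dna, frame)
        for match in MOTIF.finditer(protein):
            hits.append((frame, frame + 3 * match.start() + 1, match.group()))
    return hits

# Fragment copied from SEQ ID NO:38.
fragment = "ATCTGGAGCGAGTGGAGCCACCCCACCGCT"
frame0 = translate(fragment, 0)
hits = find_motif(fragment)
```

A fuller search would also examine the reverse strand and would normally rely on an established sequence-comparison tool rather than a regular expression; the sketch only shows the principle of the motif-based filter.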

14. An isolated nucleic acid molecule comprising a sequence of 
nucleotides encoding a haemopoietin receptor or derivative 
thereof having an amino acid sequence substantially as set 
forth in SEQ ID NO:13 or having at least about 50% similarity 
thereto. 

15. An isolated nucleic acid molecule comprising a sequence of 
nucleotides encoding a haemopoietin receptor or derivative 
thereof having an amino acid sequence substantially as set 
forth in SEQ ID NO:15 or having at least about 50% similarity
thereto. 

16. An isolated nucleic acid molecule comprising a sequence of 
nucleotides encoding a haemopoietin receptor or derivative 
thereof having an amino acid sequence substantially as set 
forth in SEQ ID NO:17 or having at least about 50% similarity 
thereto. 

17. An isolated nucleic acid molecule comprising a sequence of 
nucleotides encoding a haemopoietin receptor or derivative
thereof having an amino acid sequence substantially as set
forth in SEQ ID NO:19 or having at least about 50% similarity
thereto. 

18. An isolated nucleic acid molecule comprising a sequence of 
nucleotides encoding a haemopoietin receptor or derivative 
thereof having an amino acid sequence substantially as set 
forth in SEQ ID NO:25 or having at least about 50% similarity 
thereto.

19. An isolated nucleic acid molecule comprising a sequence of
nucleotides encoding a haemopoietin receptor or derivative
thereof having an amino acid sequence substantially as set
forth in SEQ ID NO:29 or having at least about 50% similarity
thereto.

20. An isolated novel haemopoietin receptor comprising the 
amino acid motif:

Trp Ser Xaa Trp Ser [SEQ ID NO:1]
wherein Xaa is any amino acid. 

21. An isolated haemopoietin receptor according to claim 20 
wherein Xaa is Asp or Glu. 

22. An isolated haemopoietin receptor according to claim 21 
comprising the amino acid sequence substantially as set forth 
in SEQ ID NO: 13. 

23. An isolated haemopoietin receptor according to claim 21 
comprising the amino acid sequence substantially as set forth 
in SEQ ID NO: 15. 

24. An isolated haemopoietin receptor according to claim 21 
comprising the amino acid sequence substantially as set forth 
in SEQ ID NO:17.

25. An isolated haemopoietin receptor according to claim 21
comprising the amino acid sequence substantially as set forth 
in SEQ ID NO: 19. 

26. An isolated haemopoietin receptor according to claim 21 
comprising the amino acid sequence substantially as set forth 
in SEQ ID NO: 25. 

27. An isolated haemopoietin receptor according to claim 21 
comprising the amino acid sequence substantially as set forth 
in SEQ ID NO: 29. 

28. A method for modulating expression of NR6 in a mammal, 
said method comprising contacting a genetic sequence encoding 
said NR6 with an effective amount of a modulator of NR6 
expression for a time and under conditions sufficient to up- 
regulate or down-regulate or otherwise modulate expression of 
NR6, wherein the genetic sequence encoding said NR6 is selected 
from the nucleotide sequence set forth in SEQ ID NO: 12 or 14 or 
16 or 18 or 24 or 28 or 38 or is a sequence having at least 
about 60% similarity to at least one of SEQ ID NO: 12 or 14 or 
16 or 18 or 24 or 28 or 38 and is capable of hybridising 
thereto under low stringency conditions at 42°C.

29. A method of modulating activity of NR6 in a mammal, said 
method comprising administering to said mammal, a modulating 
effective amount of a molecule for a time and under conditions 
sufficient to increase or decrease NR6 activity wherein said 
NR6 comprises an amino acid sequence: 

(i) encoded by a nucleotide sequence selected from the 

nucleotide sequence set forth in SEQ ID NO: 12 or 14 or 16 
or 18 or 24 or 28 or 38 or a nucleotide sequence having 
at least 60% similarity to the nucleotide sequence set 
forth in SEQ ID NO: 12 or 14 or 16 or 18 or 24 or 28 or 38 
and which is capable of hybridising thereto under low 
stringency conditions at 42°C; and

(ii) substantially as set forth in SEQ ID NO:12 or 14 or 16 or
18 or 32 or 30 or a sequence having at least 50% 
similarity thereto. 

30. A pharmaceutical composition comprising an NR6 receptor in 
soluble form and one or more pharmaceutically acceptable 
carriers and/or diluents wherein said NR6 comprises the amino 
acid sequence: 

(i) encoded by a nucleotide sequence selected from the 
nucleotide sequence set forth in SEQ ID NO: 12 or 14 or 16 
or 18 or 24 or 28 or 38 or a nucleotide sequence having 
at least 60% similarity to the nucleotide sequence set 
forth in SEQ ID NO: 12 or 14 or 16 or 18 or 24 or 28 or 38 
and which is capable of hybridising thereto under low 
stringency conditions at 42°C; and

(ii) substantially as set forth in SEQ ID NO: 12 or 14 or 16 or 
18 or 32 or 30 or a sequence having at least 50% 
similarity thereto. 

31. An isolated antibody or a preparation of antibodies to an 
NR6 receptor, said NR6 receptor comprising the amino acid 
sequence:

(i) encoded by a nucleotide sequence selected from the 
nucleotide sequence set forth in SEQ ID NO: 12 or 14 or 16 
or 18 or 24 or 28 or 38 or a nucleotide sequence having 
at least 60% similarity to the nucleotide sequence set 
forth in SEQ ID NO:12 or 14 or 16 or 18 or 24 or 28 or 38 
and which is capable of hybridising thereto under low 
stringency conditions at 42°C; and

(ii) substantially as set forth in SEQ ID NO: 12 or 14 or 16 or 
18 or 24 or 28 or 38 or a sequence having at least 50% 
similarity thereto. 

32. A transgenic animal comprising a mutation in at least one
allele of the gene encoding NR6.

33. A transgenic animal according to claim 33 comprising a
mutation in two alleles of the gene encoding NR6.

34. A transgenic animal according to claim 33 or 34 wherein
said animal is a murine animal. 
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12/43 


13/43 
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16/43 
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Fig.2 
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< 3 1 cccagaactct 

g38 agtttcaagacagtgtgtt
g83 aagaaaagaaataaagaga
g128 cagcttggtgggtaagggg
g173 agccccatccctaggaatc
g218 cagctgctgacctccatac
g263 ggagacataatcaattaat
g308 ggcatttatgactgatgtt
g353 aatatacctgtttgtattt
g398 atttgagacagggcttctc
g443 tcactctgtagaccaggct
g488 ttgtgcttcccaagtgctt
g533 gcaaaattgcatactttaa
g578 actaatgtgtgaattccag
g623 ctattcttaccctcccccc
g668 ttgtgtatgtacatgtgtg
g713 acttgtagaagttctctcc
g758 actaaggtcctcaggctta
g803 catttcactggccctggat
g848 aggtctcttgtagctctag
g893 gtcatcttgagctgctggt
g938 aatgatactcaggcagcac
g983 ccttgattttgttgcctca
g1028 gtttcttttctttatctgt
g1073 ttcctgactcttgaaacat



Fig.2(i) 

SUBSTITUTE SHEET (RULE 26) 



WO 98/11225 



PCT/GB97/02479 



4/43 



tggacgctgaggcaggaggattccca 

tctaggtaatgagaccctgtcaagaa 

caagaaaatgtttataggctgtgaga 

cacttgcctccaatcaagatgacctc 

catggtagaaggagaaagcaaactcg 

atgtgctccaatgtgcacacacacag 

aggatgtatttgcttagatttgagta 

ttaaaatttttatttgattttatgaa 

ggtttggtttggtttgagttttgttt 

tgtgtagtcctggctgtccttggaac 

ggccttgaactcagaaatccgcctgc 

agattaaaggtgtgcactgccattca 

ccccagtatttgggaggcagaggcag 

gctagccaaggatacagagtgagacc 

ccaaaaccccaaaatgtattttgtgc 

ttgcagcacgtaaatgtccaaggaca 

gttcacagtctaagtcctgaattcaa 

tgactgatgaattaatttttgagata
ctaggctcaaactatgaactcccaag 
actcttgcttccaccccaagtggtgg 
ttctctggggaaggggctggccttgg 
gcttcaatgagtgcttgggtctcgtt 
gaaatgggtgaacacctgttcaagac 
ccaggcagggtgagggacttgaagtg
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gins 
g1163

g1208



ggctcatcccatgcctaac 
agctgtaatcagcccccag 

L Q A T c s 
CCTGCAAGCTACCTGrTHT 



g1253
g1298
g1343
g1388
g1433
g1478
g1523
g1568
g1613



A E G L Y w 
CGCTGAGGGGCTCTArTnn 

E L S R L L 
TGAGCTGTCCCGCCTCTTT 

A N L N G S 
GGCTAACCTTAATGGGTCC

C H A R D G 

GTGTCACGCCCGAGACnflP 
V G ' 

TGTTGGCTgtaagtggggc 

ttggcaatgacagatttag 
agccatgggctctcacttg 
aggcattgcaactctaggg 



gtaccccacagcttt 



agaa 
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aaagtgtcgtctttgaccccagacac 
DPTLLIGSS 
GACCCCACCCTTCTCATCGGCTCCTC

I H G D T P G A T 
ATACATGGAGACACACCTGGGGCr A r 



TFNGRRLPS 
ACCTTCAATGGTCGCCGCCTGCCCTC 

NTSTLALAL 
AACACCTCCACCCTGGCCCTGGCCCT 

RQQSGDNLV 
AGGCAGCAGTCAGGAGACAATCTGGT 

SILAGSCLY 
AGCATTCTGGCTGGCTCCTGCCTCTA 

cccagacactcagagatagatggggg 

agcctgggtcttctgtcctggggcag 
catgcaggcatggtcatacccagcac 
acagctgtggctgcactgtcccctgt 

L 

aagctgtcatgttttccttgtagTGC



Fig.2(iv) 
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gl658 



g1703
g1748
g1793
g1838
g1883
g1928
g1973
g2018
g2063
g2108
g2153
g2198



P P E K P F N 
CCCCTGAGAAGCCCTTTAA 

K D L T C R w 
AGGATCTCACGTGCCGCTG 

F L H T N Y S 
TCTTACATACCAACT A PT r 

ccagccaagccttgctgtg 
tgatcaaatatgttcctgt 

W Y G 
cctccacaq GTGGTACGflT 

T V G P H S 
CACTGTGGGCCCTCACTCA 

F T P Y E I 
CTTCACTCCCTATGAGATC

S A R S D V 
CTCAGCAAGATCTGATGTC 

tgagcccccagtgtccacc 
cgcctcccccccatccccc 
ttagccacagccacggtgg 
taatgcaaagactttcccc 



Fig.2(v) 
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ISCWSRNM 
CATCAGCTGCTGGTCCCGGAACATna 

TPGAHGET 
GACACCGGGTGCACACGGGGAGA TAT 

L K Y 1 K L R 
CCTCAAGTACAAGCTff an^f f Jjr n r 

tgacttctggcaatacttaccttctc 
ttatgaactcaaaagggactctcgca

QDNTCEEYH 
CAGGATA ACACATGTGAGGAGTAPra 

CHI PKDLAL 
TGCCATATCCCCAAGGACCTGGCrrT 



W VEATNRLG 
TGGGTGGAAGCCACCAATCGCCTAfiff 

LTLDVLDV 
CTCACAC TGGATGTnrTGGACGTRfl ry 

tgtgttctgccctagaccttataggg 
cagactttttggttcttctagaggtc 
ttgcaggacagtggttgttcataact 
caagacagtcaagatttttcccctcc 



Fig.2(vi) 
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g2243 

g2288
g2333
g2378
g2423
g2468
g2513
g2558
g2603
g2648
g2693
g2738
g2783
g2828
g2873
g2918
g2963
g3008
g3053
g3098
g3143
g3188
g3233
g3278
g3323
g3368



ggcctgaccaccctccctc

gtcctaggggactgagagg 

ggaagccgaggccttgagc 

acgaactggatgatccctg 

ggtgttcccagcccaaagc

gcctcactgaagactcagg 

tggtcccccaggagggttc

tccagaggttttgtgtctt 

ctgtggctggcacagctgc 

aggcatcagaggtggacat 

caaatagcacctcaaggtg 

cctgacgctcagaaagcct 

tcactctgggacatgtagt 

tagctttaagagtcagctt 

taataggtgctgggtgatg

tctctgcgctaatctccac 

cttgagggcaggaatgtgt 

gtagcagcaactgctgctg 

taatctatcaggcctgggt 

gtctggaaaacgcagatag 

ttacaccactgggtgttct 

tcctcagaactgggagcac 

taatgccagcattagggga 

ttcaaggccatcctgaatt 

ggtgcgcagtaaaaccttg 



Fig.2(vii) 
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acacacacactctgcagagaacacct 
tctacagcccaggtgttcagaaggga
aggcgcccaggtctgaaggcgcccca 
tgggggggggggcgagggttggaggc 
agcacaactgggcctaatctaattag 
agcctgggccatttaacccttcaagt 
ggagagatcagcttgtactctctcca 
ctgggtgcccctggctcattcccaca 
cctggcatctaaccctcagttgtgct 
cccgtggaggctcttggtaatgtaca 
gggatggggatacatagggatggagc
gggtgatatacaataaagcttgtcac 
actcatgatgatcacaattgttgaca 

gtgacttaatactggaactcagggcc
ctcgcctcactccctgtttagtgaga 
cccagctgggtgggctgctctgtccc 
gtcttccatcagagataggacccgtg 
gctgtttctggaatattaaatgacag 
gagtagctaacaggggtgggggcgtg 
ggtcataggagccactgcagcctaga 
gtcactaggccattctcaccaagcag 
tgttgccagcatttaatgccagcatt 
ggcagaggcagaaggatctctctgag 
tacataaagagctccaggccagccag



tc 



agtg 



Fig.2(viii) 
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g34 13 accaggcttgctccacccc 



V H V S R V G
g3458 GTGCACGTGAGCCGCGTTG

R W V S P P
g3503 CGCTGGGTCTCACCACCAG

K Y Q I R Y
g3548 AAGTACCAGATCCGCTACC
g3593 gtgcccgtcccgccccgga

g3638 ctgactcctccctcaccgt

Q T S C R L A
g3683 AGACCTCCTGCCGTCTCGC

F V Q V R C N
g3728 TCGTCCAAGTGCGTTGTAA

K A G I W S E
g3773 AGGCGGGAATCTGGAGCGA

T P R S
g3818 CCCCTCGAAGTGgtgagca
g3863 aatccccaatccatcctgt



Fig.2(ix) 
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VTTDPpp D 
cagTGACCACGGACCCCCCACCCGAC 



GLEDQLSV 
GGGGCCTGGAGGACCAGCTGAGTGTg 

ALKDFLFQA 
CTCTCAAGGATTTCCTCTTCCAAGCC 

RVEDSVDWK 
• GCGTGGAGGACAGCGTGGACTGGAAG 
cccgcccctgaccccgccccccgcat 

V V D D V S N 
qcaq GTGGTGGATGACGTCAGrA am 

GLKPGTVY 
GGGCCTGAAGCCGGGCACCGTTTACT 

PFGIYGSK 
CCCATTCGGGATCTATGGGTCGAAAA

WSHPTAAS 
GTGGAGCCACCCCACCGCTGCCTCCA 



cctctccagggctggctggcccatgg
tccttcccccccaccctttttttgag 



Fig.2(x) 
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g3908 

g3953
g3998
g4043
g4088
g4133
g4178
g4223
g4268
g4313
g4358
g4403
g4448
g4493
g4538
g4583
g4628
g4673
g4718
g4763
g4808
g4853
g4898
g4943
g4988
g5033



acagcgtcttcaggtagcg 
gtcaaggatgacctcgagc 
gacaatggccagtggccat 
agtctatttagcctgtcat 
tgacctcttgtaagagaac 
tatcctaggctctctagag
ttacagccagttatcacat
acctatagaccacagtgcc 
tgctggcccacccctccaa 
taatatttgcaatcctcct 
ccaggcattaacccaagtt 
gtgggagggcctaaagatg 
agcccatggatctgcactc 
tgtctggcctcagtttccc 
cggtccaagacacttcatt 
cccatcccccacccgcttc 
tacactgaaactgaactct 
atgatgaaataatggggaa 
gaagagggtcaaaaccagc
gggcctctccaggttctgg
aggggctggagcctgggag
ctgcgattcttgcacggga
gagactgaagaagccgggg
gctgtgggggccgaagctt
agttttatttatggcgtga 
ctgggggatggctgcggct 



Fig.2(xi) 
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catgctggccttaaattcagtatgta 
t c 



caccacctttgggagactagccatgg 
ttggtgacagatggagtacaacagtg 
tgaagacaggctgtttttaaccccaa 

ggtcccacagaaccttttgtcacaca 
tgtgcctaccacataagggtctctac
cccttaaaaggtaacctaggcagcct 
acctcagcctcttgaatgctcagaaa 
tctcttctctgggtccctttcttaag
acttcctttgtcctgaagactctccg 

cacctgtcaggtttaggcagcacagt 
atttgcaggcagtataagaagaagct 
ctccggtccctaagacagaatacttc
cgcagacgcatatgctcactttaatg 
actgaggctccgagagattcctggag 
tccaggaagctctccagcccccatcc 
gcttggcgggagtgaacacagctggg 
ctttggcccttgctcgtgcccagcac
gccagcaggcggctgcgtccgcccga 
gtagggttggagggaggtaagcaggg
gtgccagggcctgtcagcgagtcccc 
ggccgatgtccttatccgctggcctg 

ggggattggacccaagggctggcttc 



Fig.2(xii) 
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g5078 

g5123
g5168
g5213
g5258
g5303
g5348
g5393
g5438
g5483
g5528
g5573
g5618
g5663
g5708
g5753
g5798
g5843
g5888
g5933
g5978



ccactcagtcctccagccc 
tgaggcttatcttgggaac

aatataactacgttttaaa 
ttcgtgagcgtgcgtgcca 
tttgttgagtaggctcctt 
caagagcaattactgagtc 
tcccatcctgtttggatag 
ggctttaatttcgtagcta 
gctaccacgtttgtgggag
gacacagtcccaagatctc 
gccccttgctttgtccgtgt 
cattgactggtctttcctt 

ccattcctctgggtgactc
actttccccagccgaagct 
gcgcgcgcctcctgctggc 

E R p g 
tcttta gAGCGCCCGGGCC 

G G E P S S 
GGCGGCGAGCCCAGCTCGG 

F L G W L K 
TTCCTCGGCTGGCTCAAGA 

F R L Y D Q 
TTCCGCCTGTACGACCAGT 



Fig.2(xiii) 
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actccatgtcacacccgtgcattctc 
ccgcccttgttctgtgctgtctgtct 
tcccagagccttttttttatgctttt 
aattgcttttgtataatgtgtgtgcc 
caacacacacgtgaaggttagagaac 
ccaccatgtgggactagggctggcga
atctcgccagcccctcacccctcact 
tcataggtaatcgaaggtaaatcgct 
tcctgcctcagcctaccaagtgctgt 
gggctctcctcccagtgtctgggggt
tgctttctaggtctttgtcttagttt
ccctagagtctccggccccacttatc 
taccgaatactcggttttacctccca 
tgcttgtctccatcgccgtggcattg

ggtctggtatgggaggccgccgtccc 
cgcgccccaacactgccgctccattc 

PGGGVCEPR 
CGGGCGGCGGGGTGTGCGAGCCGCGG 

GPVRRELKQ 
GCCCGGTGCGGCGCGAGCTCAAGCAG 
KHAY CSNLS 
AGCACGCATACTGCTCGAACCTTAGT 

WRAWMQKSH 
GGCGTGCTTGGATGCAGAAGTCACAC 



Fig.2(xiv) 
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K T R N Q V 
g6023 AAGACCCGAAACCAGGTAG

G K G A E E
g6068 GGTAAAGGAGCAGAGGAAG

Q H R T L L

g6113 CAACACCGCACTCTTCTTT

P R A D G V
P S G R R G A
g6158 CCTCGGGCAGACGGGGTGC

g6203 GTGGGGCCTACAGCAGTCT
g6248 TGTTGCTCAAAGGGATCTC
g6293 GAGCCCCAGGTTTTACTGC

g6338 CTTAATGTGGCCTCTTTTC

*

g6383 CTAAGGATAGGCCATCCTC
g6428 CTGAATTGGAGCCCCTCTG
g6473 CCAGAGGCTGGGCACAATG
g6518 ACATGATGGTCACACTTGG
g6563 GGTATTGCAGGGCCTCCCA
g6608 TTGTTCAGGTcccgatggc
g6653 ggtgggggga



Fig.2(xv) 
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GKLGEACVG 
GAAAGTTGGGGGAGGCTTGCGTGflnn 

ER DPGEQpp 
AGAGAGACCCGGGTGAGCAGCCTCCA 

SKHRTRGSC 

D E G I L 

CCAAGCACAGGACGAGGGGATCCTGH 

RREVRG S G * 
A R 

GGCGAG AGGTAAGGGGGTCTGGG Tna 
AGATGAGGCCCTTTCCCCTCCTTCGG 
TTAGTGCTCATTTCACCCACTGCAAA 
ATCATCAAGTTGCTGAAGGGTCCAGG 

V L P A K L 
G P A G * 

TGCCCTCAGGTCCTGCCGGCTAAACT 



CTGCTGGGTCAGACCTGGAGGCTCAC 

TACCATCTGGGCAACAAAGAAACCTA 
AGCTCCCACAACCACAGCTTTGGTCC 
ATATACCCCAGTGTGGGTAGGGTTGG 
AGAGTCTCTTTAAATAAATAAAGGAG 
cagtgtgtt tggggcctatgtgctgg 

Fig.2(xvi) 

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GCGGCCGCTG CAGTGATTAC TCACCGCGTG 
TTTTTCCGTG GGGGGATGTG AAGAAGTTTA 
GGAATGCAGG GTTCGGTCCC GTTCCCCAAA 
AAGGGCTCCC TGCACGCGCT CCGGGACATC 
TGAGAAGGGA CCAGAGGCCG GAGACTCCCT 
ACGAAACGAG ACTACAGCGA TGGGAGAGGT 
GACCCATGCA CCCAGAGAAA GGGACTGGTG 
AGGGCTGAAA GAGGATGAAC GGGCTCAGGT 
TGGGTATGGG GGCCCCGTAA GAGGGGCGGG 
GGAGGGGATC CTGGAAAAGC ACCAGGGCTG 
ACAGGATCCC AGATGAGGGG GTGGGAAGCC
CACGGGCTGG TGGGGAAAGA GTGGGGGGCT 
GTAACTGGGC GGAGGCCGGC CGGGCGGGGC 
GTGCGGGGCC CACGATCAAC CCCCCCCCAG 
CGGGGCGAGC GGCGCATTAG CGCCTTGTCA 
CGCTGTCCGC GCCCAGTGAC GCGCGTGAGG
CGCCCCCGCC CCATACCGGC GTTGCAGTCA 
GGGTCGCCCG GGCCCCGTCG CCCAATCCGC 



Fig.3(i)

GCGCACCCCA CCCGCGGGCC GCTGAGTGGA 60
GGGAGAACTC TTCTGCACCG ATGGGAACTA 120
GGACACACCT CTCCCCATAA GCCCACTCAT 180
CCCATATCCA ATACCCGCAG ATATGATAGT 240
CCCTGCCTTC TGGCTTTCCC CCCCCCCTGC 300
GGCATGAAGG CTTAGGGTGG GGATCGGTAG 360
GCAACTTTCA AACTCTCTGG GGAAGGAAGA 420
ACTGCTCAAT GTGTGTGTGG CGGACCAAAG 480
GAAGGTGGAT AGGAAGGATC CCGGTAGACT 540
CGAGCTAGGA ACCCATTCGG AGTTAAGGGT 600
TGGGACGGGC GGGACCAGAG AGGGAGGTCC 660
TCGCGCAGGA GGATGGGACG TTCAGGAGTG 720
GCGCGGTGCC CGCGGGCGGT GGGAAGGCCG 780
GGGCCGGGCC GGGCCGGGGG CGGGGCCGGG 840
ATTTCGGCTG CTCAGACTTG CTCCGGCCTT 900
ACCCGAGCCC CAATCTGCAC CCCGCAGACT 960
CCGCCCGTTG CGCGCCACCC CCATGCCCGC 1020
GCGGCGGCCG CCGCGGCCGC TGTCCTCGCT 1080



Fig.3(ii) 
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GTGGTCGCCT CTGTTGCTCT GTGTCCTCGG 
GTACCGTGCG CCCTGCTCCC CACCTCCCCA 
AGTCGCGGGG GATGGAAGAA GGGGCGCGAG 
GGCGGCCCTC GGGGCGCCCT CACCTGTGGG 
AGTACCCCGT TATACATCAG AGGCCTCTTA 
AGGCTCAGTT TGAAGGACAT CGCAGTGTCC 
GCTTCGGGGC GCACGCCTGT GTCTTGGATA 
GGGCGCACGC TTGGGTGCGT TGGGTTGGGT 
GAAGTGATGA TCCCCGGGGG GAGGGTGGGG 
ATGCGGCCCG GCGTCCCTCG GGACTTGCCT 
CTATAGCAGA CTCCATGCTT TGGTATCCTC 
CGGTCTCATT CAGGCTGCGC TGGGTTGAGA 
CGAGAGCAAG CGTGTCCGGG CACCGCGAGC 
GGGGGTCAGC TGCCGAGAGA ATCCCACTGT 
ATCACCCAAC GCACACATCC CCGCCAGGAT 
CACACCCAAA GACACACAAA AGAGCCCCAC 
CGCGCGCTGC AGCCCAGATG CGTATTCGCA 
ACACACACAC ACACACACAC ACACACACAC 



Fig.3(iii) 
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GGTGCCTCGG 


GGCGGATCGG 


GAGCCCGTGA 


114(1 


GGGAAGCCGG 


GATCCGGCGC 


CCCGGGGGGT 

>~» \-J v^J v»j \j\j Vj jl 


1200 

-L Xi U U 


CGCCACCTGG 


ACGTCCCGGG 


AACAAAGGAA 


1 9fi0 

X o V 


GCTCATGGCA 


CCACCACCCA 


GPPTPPPAAG 


1 79H 
1 jZU 


TCTGTATCCC 


CTTTGCGAGG 


PTGTPTGGPP 

V— - X vj x V— X uuLL 


IjOU 


TGGGACCCCC 


PTPPTTPAGG 


GTGPTGGGAP 


i a a n 


tpagagpgga 

X L jr^v3jrA\»J uuun 


agggaagppt 


LLL X vjvjtLL.Vjt\j 


loUU 


gptggpgpaa 


agtggggtpp 


PPTPPPPPZ\T 
LL X LLLLLi-ix 


IjOU 


CGTTATCGTG 


agppptpptg 

x LL x O 


TPPGPPTGfJP 




ptppgtgggg 


t p prz r 1 nrx p 


LLLL 1 LLLLL 


IboU 


gaagtpptpt 


PP A C"TC2C1TCICZ 


nnPTPAPAAP 


i h a r\ 


GCCTCTAGCG 


ACTGAAATTT 


PGGTGAGGAG 


x o u yj 


CCAGACTTCA 


TTGTPTAAGG 

x x vj x \_» x /ItiwVj* 


GGPAPPPAGT 


X O D U 


CCCAGGAGGA 


ACTCCTGGCC 


TTGAGCCCCC 


1920 


GCGGTCTCCA 


CATCCAGACC 


CTCTCTGGGA 


1980 


TGGCTTATGT 


CCCGTCACCC 


TGCCCTCCGA 


2040 


CACCATCGCG 


GCGCTCGCAT 


TCCATCCTCT 


2100 


ACACACACAC 


ACACACAGAC 


ACGCACACAC 


2160 



Fig.3(iv)



ACACGCACGC ACACACACGC ACGCCCGCAC 
GCAACACCGG GGTACGCATA TGGTTGAGTG 
ACCCCATCCG GAGACACAGG CCACACCGCA 
TAGTAGTCTT GTGCAGTTTG TCCGCGGTGT 
ACAGGAACCT ACACTCCTGC TTGCCCAAGG 
GACCTTTCCG GGGAGTTGGT GTTGCTGCCA 
GCGCTAAGCT TTGTTTCCGG GCGGGCTGCA 
TGGCGCGTGT GTTTTTTCTT TTAAGGGGGA 
TGCAATCTGT TTGTACTTAC CGTGTGTCTT 
AAAGTGTATG CAGGTACCAG CGGGACAGGA 
GAGGCCACCT TCCCGTTGGC CTTTCAGGGA 
GTGTTCTTTT TAATAACGGC AGCAACTCCG 
GGCCCCGGCT TTGTGGAAAG GAGGGGAAGA 
GGCTTAGGGG GCTGTCAGCT GCTGCTCTGT 
AGTGGCTTTG GCCCATTGTT TGTGGAAGCC 
TACTCCAGAG TCAGGCTTCT CAGTCCGAGC 
GAATCAGGGA AGGGGGTGCC AGGTGGACTA 
AAGGAGAAAG CTTGGGCTTG CCCCCCTCCC 



Fig.3(v) 



TPrjTnnTPrr 

X \~\J X 


A PA r P r P r P A r T r P r P 


P7AP7APPPP AP 
L/iLAbbbbAb 




p a p tp p a p a t 1 


Lili L.UUUAU 


LAL lLI L AGG 


2280 




n ptp pp ptp p 


1 v^L 1L1 GGGL 


2340 


PTPTriPAPPP 


LL X v^L-L^LjL- X 


1 1 bi X LAbbbb 


2400 


L.GGC 1 bjVjbxUA 


PPTP A rpprPPP 

GG 1 bA 1 bl GG 


TGACACCCGG 


2460 


AvjUCIovjLjI A 


blllli G AA x 


GCLACCAATA 


2520 


/^i 7v r*r*7\ a papp 
GAla v_ AAL Avjbx 


LbAAbb X GGG 


GGAG I GGGGG 


2580 


GAGAAATTAA 


ATAAGAGGTT 


CTCACACCTC 


2640 


AACAC CTGAC 


CAGCCAGCCG 


GTGGGTCGTA 


2700 


bA 1 GGGGGCC 


CCTGGGGTAT 


GGCTGGGATG 


2760 


ATCTCACACT 


TTTCCCTTTT 


TV TV TV TV /"I TV ^ITV m/"*1 

AAAACACATG 


r\ 

2820 


CATTGGGAAA 


GGGGGAAATA 


TV ^i/irnrp^imTt mTv 

AGCTTGTATA 


2880 


bfbjVjAAbiAAAA 




(j J/C, I LCA 


2940 


CTAGCTTGGC 


ATGTGTGTGC 


CCCAGTCCCC 


3000 


AAGAGGGAGA 


CTGGAGTCCT 


CTATCTCTGG 


3060 


CCAGAGAACG 


TCTTCCCTGT 


TTTATGGAGG 


3120 


CGTTCTGCTG 


AGGACTGTAC 


CAGTCGCTCG 


3180 


CCCTCAAGCC 


ACGAAGGGCA 


GCTGCTAGGC 


3240 



Fig.3(vi) 




rn 7\ prpprpripmT\ 


AAAGGGCATT 


ACTCCCCAGC 


CAGACAAATG 


CTGGGGAGGG 


ACAGAGGGGT 


GGTCCCGGGT 


CGGGCAGTGC 


CTCCCACCCT 


GGGTGGGL CG 


GGGTAGAGAC 


GCTGGCACGT 


GCGGGCGGCT 


GGCTGCCTGG 


GACCTCCGGG 


GCCTGCTCCT 


CCTGCTCCTT 


CGCACGGACG 


CCCAAATGCA 


ACTGCGATTG 


CAGGCTTCGC 


CCTGGGAGAA 


GTCATTCAGG 


GCCCAGACTA 


GGGCATGAAG 


GACCGTCCAG 


GGCTGCAGTT 


GCAGCCTCTG 


TTCTCCGAGC 


CTCTTTGGAA 


AATACTCTTT 


TCCTCTCATC 


CCATCCCGGG 


TGCAGTCTTC 


CCTAACCTTT 


TCTTTGCTTC 


CCTCTCCCCT 


TGCCCAACTG 


GGGCTCCAGC 






UVJu 1 lul riuU 


CTCTTTTGCT 


TCTGAGACTT 


AATTTTTTTC 


TCTCTGTACA 


GCCCTGGCTG 


CCCTGGCACT 


ACAAACCTAC 


CTGCCTCTGC 


CTTTCCAGTG 


AGTAGTTAAG 


TGTTTTGCTG 


TGTCTTTATT 



Fig.3(vii) 



C ACrLiAC LLLL 


CALxAIjxAC ICC 


ppnnrpri /"*im/^ /"i n 

CCTTCCTGGC 


3300 


\j 1 LjA 1 LA 1 1 Li 


c c c accac rc 


CAGACAGTGG 


3360 




LtCLjCCCACCC 


7V pi pi 7\ 7vn/inr«m 

AGGAAGCGGT 


3420 


CCCAG1 1CA1 


pi P< pi pi 7\ 7VPiPi7V 7V 

GC CGAAGGAA 


TTCTGAATTA 


3480 


GCGGCCCCC 1 


GGCCCCCGCC 


GCTCCGTCTG 


3540 


pirn pi 7\ p*i TV primp 

C 1 (jACACC 1 L 


pi pi /""'im/"'1 7\ /"I /^i /~i /^i 

CGCTGAGCCC 


^nn/"*inTv y-^t ^ ■» 

TGGGACAAGC 


3600 


TV 7\ ptv pi n n/^i /~i /~i 

AAGACCCGCC 


TCCTCCCAAG 


GCCAAATTTG 


3660 


GAAC CATGTT 


GGTGCCACCT 


CATCCATCTG 


3720 


TAGCTTCTTA 


ATAGGAACCT 


GGGGGTGGGT 


3780 


ATCGGTTTTG 


TTTTTGTTTT 


TGTTTTTTCC 


3840 


ACTGTTTTCC 


TCCCTAAGGG 


TTGAGAGCCC 


3900 


nn "7\ n /-i /"i /i tv nn/""i 

TACCCCAGGG 


CCTTTGCACA 


TGGAGTCCCA 


3960 


/ 

I II 1 I 7\ /*^rp/-»/-»7v rp 
L.1 1AC1GLAI 


TTGGCTCTTG 


GTAACTGTCC 


4020 


CCCAGCTCCC 


TCTCTTCTCC 


TCCCCCCTTT 


4080 


TTTTTCTTTT 


TGGCTTTTTG 


AGACAGGGTT 


4140 


CATTCTGTAG 


ACCAGGCTAG 


CCTCAAACTC 


4200 


CTGGCACTAA 


AGATGTGGGC 


CACCACAACT 


4260 


CCTATAGTGA 


CCTCAGTTCC 


TGGCATATTG 


4320 



Fig.3(viii) 




TAGGCGATGG ATGGATGAAT GGATGGATGG 
CTTGAATCGT CCTGAGTGAA AAAAGAGACC
GGCAGCCTGG CCTGCTGGTC TCATGGGAGC 
CACCCTGCCA TCCTGTGTGG CTGACAAGAA 
AGGGAAGCTT GGAATATGTT CCCCTCCTCA 
CCAGCCTATG AGTAGGGCAG CTGTGGGCTG 
GTCCCTCAGG GTGGGTCACA GGATTGAGGT 
AGGAAATGAT TGTGGAGAGT CAGAACTCCT 
GCTTCTGTGG CTGTCCCTTC TCTTGTGGTC 
TGTGAGGAGG GCACGGGGAA AATGAAGGCT 
CCAACAGGGC TCACCTCTCC TCTGGACAGG 
TTTGATTCCC TTCCTTTGGT CTCCTGGGAT 
TTTTAGATAT GTCCATTCTC CAGAAACACA 
ACCACCAGGA CAGACAAAGA ATTGGAGAGG 
TGGCTTATGT GTAATCCCAG AACTCTGGAC 
CAGTGTGTTC TAGGTAATGA GACCCTGTCA 
ATGTTTATAG GCTGTGAGAC AGCTTGGTGG 
CCTCAGCCCC ATCCCTAGGA ATCCATGGTA 

Fig.3(ix) 



ATGGATGGAT 


an a TnPTTPP 


A 1 oUAGLAAG 


4380 


TCAGAGAACT 


\J*~\r\ X \j\jJr\\J X X 


A PPTTPPPtv 


444 0 


X V^V^V^ X O X VjTrxri 


aPTTPPPPPA 


CALL xCCCAC 


4500 




PP A P 1 TV TrTT^P 1 


ACACAGACTC 


4560 




PTTPTTPTPP 


LUC 1 GAGGGC 


4620 


V— V^V^ X±±ti\3\J X X 


npp r PAPPr*A a 


GAAGGGGGTG 


4680 


PATTTPPA 7V A 


Lj x (jGL (JATCA 


CAGTGGCCCT 


4740 


fiTTnnn a r ir n r r 

ull v^Ljo'/IIj 1 X 


Vj X ALj AG G G C C 


TTGCATGTGG 


4800 


PTTTH P Z\ P A P 
till ov-. ALAVj 


•PPPPPTPrirnn 


TGTGCTGGGA 


4860 


P A P P P PPT O A 

LA<^UL.Lv_ 1 LA 


bL rTGCCCT T 


CACGGTTCAC 


4920 


L X X LAL 1 Lj 1 


Al GCACAGAT 


TGGCCTCACA 


4980 


WX^X^^TLttNTi^^ri. X X 


fPJ /T" , 2Vr i /T' ,r P'A 
1 AL LAJjtatj 1 A 


GGA1 1 1TACA 


5040 




1 AVjLjIj 1 A I (_ A 


GTGAAAGGAC 


5100 


AAGGAAATTG 


GTAAGCCAGG 


CCATGCTTGA 


5160 


GCTGAGGCAG 


GAGGATTCCA 


AGTTTCAAGA 


5220 


AGAAAAGAAA 


AGAAATAAAG 


AGACAAGAAA 


5280 


GTAAGGGGCA 


CTTGCCTCCA 


ATCAAGATGA 


5340 


GAAGGAGAAA 


GCAAACTCCA 


GCTGCTGACC 


5400 



Fig.3(x) 




TCCATACATG TGCTCCAATG TGCACACACA 
TTTGCTTAGA TTTGAGTAGG CATTTATGAC 
GAAAATATAC CTGTTTGTAT TTGGTTTGGT 
GCTTCTCTGT GTAGTCCTGG CTGTCCTTGG 
ACTCAGAAAT CCGCCTGCTT GTGCTTCCCA 
TCAGCAAAAT TGCATACTTT AACCCCAGTA
ATTCCAGGCT AGCCAAGGAT ACAGAGTGAG 
CCAAAATGTA TTTTGTGCTT GTGTATGTAC 
ACAACTTGTA GAAGTTCTCT CCGTTCACAG 
AGGCTTAGCC ACAGTCTTCT TTATGTACTG 
GAATTAATTT TTGAGATAAG GTCTCTTGTA 
AAGGTCATCT TGAGCTGCTG GTACTCTTGC 
GCAGCACTTC TCTGGGGAAG GGGCTGGCCT 
GAGTGCTTGG GTCTCGTTGT TTCTTTTCTT 
GACTTCCTGA CTCTTGAAAC ATCCAGGCAG 
GCCTAACAAA GTGTCGTCTT TGACCCCAGA 
CCTTCTCATC GGCTCCTCCC TGCAAGCTAC 
CACCGCTGAG GGGCTCTACT GGACCTTCAA 



Fig.3(xi) 

CAGGGAGACA TAATCAATTA ATAGGATGTA 5460
TGATGTTTTA AAATTTTTAT TTGATTTTAT 5520
TTGGTTTGAG TTTTGTTTAT TTGAGACAGG 5580
AACTCACTCT GTAGACCAGG CTGGCCTTGA 5640
AGTGCTTAGA TTAAAGGTGT GCACTGCCAT 5700
TTTGGGAGGC AGAGGCAGAC TAATGTGTGA 5760
ACCCTATTCT TACCCTCCCC CCCCAAAACC 5820
ATGTGTGTTG CAGCACGTAA ATGTCCAAGG 5880
TCTAAGTCCT GAATTCAAAC TAAGGTCCTC 5940
AGCCATTTCA CTGGCCCTGG ATTGACTGAT 6000
GCTCTAGCTA GGCTCAAACT ATGAACTCCC 6060
TTCCACCCCA AGTGGTGGAA TGATACTCAG 6120
TGGCCTTGAT TTTGTTGCCT CAGCTTCAAT 6180
TATCTGTGAA ATGGGTGAAC ACCTGTTCAA 6240
GGTGAGGGAC TTGAAGTGGG CTCATCCCAT 6300
CACAGCTGTA ATCAGCCCCC AGGACCCCAC 6360
CTGCTCTATA CATGGAGACA CACCTGGGGC 6420
TGGTCGCCGC CTGCCCTCTG AGCTGTCCCG 6480



Fig.3(xii) 


CCTCCTTAAC ACCTCCACCC TGGCCCTGGC 
GTCAGGAGAC AATCTGGTGT GTCACGCCCG 
CTATGTTGGC TGTAAGTGGG GCCCCAGACA 
GATTTAGAGC CTGGGTCTTC TGTCCTGGGG 
CATGGTCATA CCCAGCACAG GCATTGCAAC 
TGTGTACCCC ACAGCTTTAG AAAAGCTGTC 
CCTTTAACAT CAGCTGCTGG TCCCGGAACA 
GTGCACACGG GGAGACATTC TTACATACCA 
TACCCAGCCA AGCCTTGCTG TGTGACTTCT 
TTCCTGTTTA TGAACTCAAA AGGGACTCTC 
CACATGTGAG GAGTACCACA CTGTGGGCCC 
CCTCTTCACT CCCTATGAGA TCTGGGTGGA 
TGATGTCCTC ACACTGGATG TCCTGGACGT 
GCCCTAGACC TTATAGGGCG CCTCCCCCCC 
GTCTTAGCCA CAGCCACGGT GGTTGCAGGA 
TTTCCCCCAA GACAGTCAAG ATTTTCCCCT 
CTCTGCAGAG AACACCTGGC CTGACCACCC 
GAGTCCTAGG GGACTGAGAG GAGGCGCCCA 



Fig.3(xiii) 

CCTGGCTAAC CTTAATGGGT CCAGGCAGCA 6540
AGACGGCAGC ATTCTGGCTG GCTCCTGCCT 6600
CTCAGAGATA GATGGGGGTT GGCAATGACA 6660
CAGAGCCATG GGCTCTCACT TGCATGCAGG 6720
TCTAGGGACA GCTGTGGCTG CACTGTCCCC 6780
ATGTTTTCCT TGTAGTGCCC CCTGAGAAGC 6840
TGAAGGATCT CACGTGCCGC TGGACACCGG 6900
ACTACTCCCT CAAGTACAAG CTGAGGTTGG 6960
GGCAATACTT ACCTTCTCTG ATCAAATATG 7020
GCACCTCCAC AGGTGGTACG GTCAGGATAA 7080
TCACTCATGC CATATCCCCA AGGACCTGGC 7140
AGCCACCAAT CGCCTAGGCT CAGCAAGATC 7200
GGGTGAGCCC CCAGTGTCCA CCTGTGTTCT 7260
ATCCCCCCAG ACTTTTTGGT TCTTCTAGAG 7320
CAGTGGTTGT TCATAACTTA ATGCAAAGAC 7380
CCCCACCCCC AACACACACA TACACACACA 7440
TCCCTCTCTA CAGCCCAGGT GTTCAGAAGG 7500
GGTCTGAAGG CGCCCCAGGA AGCCGAGGCC 7560



Fig.3(xiv) 




TTGAGCTGGG GGGGGGGGCG AGGGTTGGAG 
GGGCCTAATC TAATTAGGGT GTTCCCAGCC 
GTGCCTCACT GAAGACTCAG GGGAGAGATC 
GGGTTCCTGG GTGCCCCTGG CTCATTCCCA 
TAACCCTCAG TTGTGCTCTG TGGCTGGCAC 
CAAGGCATCA GAGGTGGACA TGGGATGGGG 
AAGGTGGGGT GATATACAAT AAAGCTTGTC 
GATCACAATT GTTGACATCA CTCTGGGACA 
AGTAGCTTTA AGAGTCAGCT TGTGACTTAA 



GTGATGCTCG CCTCACTCCC TGTTTAGTGA 
GTGGGCTGCT CTGTCCCCTT GAGGGCAGGA
TGGTAGCAGC AACTGCTGCT GGCTGTTTCT 
CTGGGTGAGT AGCTAACAGG GGTGGGGGCG 
AGCCACTGCA GCCTAGATTA CACCACTGGG 



AGTCCTCAGA ACTGGGAGCA CTGTTGCCAG 
AGGGGAGGCA GAGGCAGAAG GATCTCTCTG 
AGCTCCAGGC CAGCCAGGGT GCGCAGTAAA 
TGACCAGGCT TGCTCCACCC CCAGTGACCA




Fig.3(xv)



GCACGAACTG GATGATCCCT GAGCACAACT 7620
CAAAGCAGCC TGGGCCATTT AACCCTTCAA 7680
AGCTTGTACT CTCTCCATGG TCCCCCAGGA 7740
CATCCAGAGG TTTTGTGTCT TCCTGGCATC 7800
AGCTGCCCCG TGGAGGCTCT TGGTAATGTA 7860
ATACATAGGG ATGGAGCCAA ATAGCACCTC 7920
ACCCTGACGC TCAGAAAGCC TACTCATGAT 7980
TGTAGTGAGA CCCTAGCTCA AAACACAGAC 8040
TACTGGAACT CAGGGCCTAA TAGGTGCTGG 8100
GATCTCTGCG CTAATCTCCA CCCCAGCTGG 8160
ATGTGTGTCT TCCATCAGAG ATAGGACCCG 8220
GGAATATTAA ATGACAGTAA TCTATCAGGC 8280
TGGTCTGGAA AACGCAGATA GGGTCATAGG 8340
TGTTCTGTCA CTAGGCCATT CTCACCAAGC 8400
CATTTAATGC CAGCATTTAA TGCCAGCATT 8460
AGTTCAAGGC CATCCTGAAT TTACATAAAG 8520
ACCTTGTCTC AAAAAACAAA GCATCTTTAG 8580
CGGACCCCCC ACCCGACGTG CACGTGAGCC 8640



Fig.3(xvi) 


GCGTTGGGGG CCTGGAGGAC CAGCTGAGTG 
ATTTCCTCTT CCAAGCCAAG TACCAGATCC 
AGGTGCCCGT CCCGCCCCGG ACCCGCCCCT 
CACCGTGCAG GTGGTGGATG ACGTCAGCAA 
GCCCGGCACC GTTTACTTCG TCCAAGTGCG 
AAAGGCGGGA ATCTGGAGCG AGTGGAGCCA 
TGAGCACCTC TCCAGGGCTG GCTGGCCCAT 
CCCACCCTTT TTTTGAGACA GCGTCTTCAG 
TAGTCAAGGA TGACCTCGAG CTCCTGGTCT 
GGCCATCACC ACCTTTGGGA GACTAGCCAT 
GATGGAGTAC AACAGTGTGA CCTCTTGTAA 
AATATCCTAG GCTCTCTAGA GGTTAACTTT 
TCACATGGTC CCACAGAACC TTTTGTCACA 
CACATAAGGG TCTCTACTGC TGGCCCACCC 

CTTAATATTT GCAATCCTCC TACCTCAGCC
CAAGTTTCTC TTCTCTGGGT CCCTTTCTTA 

GTCCTGAAGA CTCTCCGAGC CCATGGATCT
AATGTCTGGC CTCAGTTTCC CCACCTGTCA 

Fig.3(xvii) 



TGCGCTGGGT CTCACCACCA GCTCTCAAGG 8700
GCTACCGCGT GGAGGACAGC GTGGACTGGA 8760
GACCCCGCCC CCCGCATCTG ACTCCTCCCT 8820
CCAGACCTCC TGCCGTCTCG CGGGCCTGAA 8880
TTGTAACCCA TTCGGGATCT ATGGGTCGAA 8940
CCCCACCGCT GCCTCCACCC CTCGAAGTGG 9000
GGAATCCCCA ATCCATCCTG TTCCTTCCCC 9060
GTAGCGCATG CTGGCCTTAA ATTCAGTATG 9120
TTTTGTCTCC ACTTAGAGAC AATGGCCAGT 9180
GGAGTCTATT TAGCCTGTCA TTTGGTGACA 9240
GAGAACTGAA GACAGGCTGT TTTTAACCCC 9300
ATATAAAATA GAGACTATTA CAGCCAGTTA 9360
CAACCTATAG ACCACAGTGC CTGTGCCTAC 9420
CTCCAACCCT TAAAAGGTAA CCTAGGCAGC 9480
TCTTGAATGC TCAGAAACCA GGCATTAACC 9540
AGGTGGGAGG GCCTAAAGAT GACTTCCTTT 9600
GCACTCTCTA ATATGAAATA TATTGCATAA 9660
GGTTTAGGCA GCACAGTCGG TCCAAGACAC 9720



Fig.3(xviii) 




TTCATTATTT GCAGGCAGTA TAAGAAGAAG 
CTAAGACAGA ATACTTCTAC ACTGAAACTG 
TGATGATGAA ATAATGGGGA AACTGAGGCT 
ACCAGCTCCA GGAAGCTCTC CAGCCCCCAT 
GAGTGAACAC AGCTGGGAGG GGCTGGAGCC
ACCTGCGATT CTTGCACGGG AGCCAGCAGG 
CCGGGGGTAG GGTTGGAGGG AGGTAAGCAG 
CCTGTCAGCG AGTCCCCAGT TTTATTTATG 
TGCTGGGGGA TGGCTGCGGC TGGGGATTGG 
CAGCCCACTC CATGTCACAC CCGTGCATTC 
TTCTGTGCTG TCTGTCTCTA TTTCTGTCAT 
TTAATATAAC TACGTTTTAA AAATTGCTTT 
GTGCCACAAC ACACACGTGA AGGTTAGAGA
GGGACTAGGG CTGGCGACAA GAGCAATTAC 
CTTCCCATCC TGTTTGGATA GTCATAGGTA 
TAGCTATCCT GCCTCAGCCT ACCAAGTGCT 
TCCCAGTGTC TGGGGGTACA CAGTCCCAAG 
TGCCCCTTGC TTTGTCCGTG TCCCTAGAGT 

Fig.3(xix) 

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CTCCCATCCC 


CCACCCGCTT 






AACTCTCGCA 


GACGCATATG 


PTPAPTTT'A A 
^ x v^i-iv^ xxx AA 


9840 


C CGAGAG ATT 


CCTGGAGGAA 




9900 


CCGGGCCTCT 


CCAGGTTCTG 


GGPTTRnrnr 


9960 


TGGGAGCTTT 


GGCCCTTGCT 


X L U LA L 


10020 


CGGCTGCGTC 


CGCCCGACJAn 


1 vjAAUAAG 


10080 


GGGCTGTGGG 


GGCCGAAfJPT 


TnnPPPPTy r^r*r* 
ibl VjL. L.A(jGG 


10140 


GCGTGAGGCC 


gatgtpptta 


T 1 P 1 PP P* T 1 /^ r+ r* n 


10200 


ACCCAAGGGC 


x vtov^ x X LLLR 


v- 1 CAGTCCTC 


10260 


TCTGAGGCTT 


" x >— x x uuuiiH 


P 1 P* 1 P 1 /■"* P* Try m /"i 


10320 


TCACTTTCCC 


AGAGCPTTTT 

nwnvj Vw Vw X X X X 


111 IAILjCI 1 


10380 


TGTATAATGT 


Vj X VJ X ULL J. 1L 


1 CjjACjCGj TGC 


10440 


ACTTTGTTGA 


GTAGGfTPnT 


T'P^P 1 A PP7\ rp/^irn 

1 LLALLAlbT 


10500 




PPPP7V r+r*r*r*r*\ 


TCACCCCTCA 


10560 


ATCGAAGGTA 


AATCGCTGGC 


TTTAATTTCG 


10620 


GTGCTACCAC 


GTTTGTGGGA 


GGGGCTCTCC 


10680 


ATCTCTGCTT 


TCTAGGTCTT 


TGTCTTAGTT 


10740 


CTCCGGCCCC 


ACTTAGTCTC 


CATTGATTTC 


10800 



Fig.3(xx) 




CTTTCTGACC 


GAATACTCGG 


TTTT A PPTPP 

ill IriLLILL 


CCATCGCCGT 


GGPATTGPPA 


l l w w J.L1 vjvjo 


CAACTTTCCC 


CAGCCGAAGC 


x wvjr xv x vjrvj 1 J\ 


GCTGGCCGCG 


CCCCAACACT 


uLUoL 1 LLHl 


gggtgtgpga 

WWW X W X WW w<rt. 


WW WW WWWWWW 


ubLbAbLLCA 


AGTTPPTPGG 


PTGPPTPAAfl 


iiHbLiiLbLAl 


ACCAGTGGPG 


TGPTTGGATn 

X uL X X WWX - ! X w 


P7\PA A f^T/^* A r* 
wi-iwili-iw l LAL 


yJ\J\Jr\vj\j w 1 lu 


1 w^wwww^ l 


A A A (~*f~* A O A /^i 

AAALxvj Ab (JAG 


PAPA APAPPf^ 


wi-lw 1L11L11 


1 LLAAbLACA 


GGGTG PGGPG 




ww\j 1 w 1 \jw\j 1 


PPTTTPPPPT 


PPTTPHHTYtlT 
V— V— X x www lul 


1 Ljw 1 LMAbb 


AAGAGPPPPA 


cin r v r v r v r v a r^nrr 1 

wVj llll i-\w 1 


LAlLAlLAAb 


CTTTTCTGCC 


CTCAGGTCCT 


GCCGGCTAAA 


CAGACCTGGA 


GGCTCACCTG 


AATTGGAGCC 


TACCAGAGGC 


TGGGCACAAT 


GAGCTCCCAC 


ACTTGGATAT 


ACCCCAGTGT 


GGGTAGGGTT 


TTAAATAAAT 


AAAGGAGTTG 


TTCAGGTCCC 


GGGGTGGGGG 


GA 





Fig.3(xxi) 
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CACTGATTTG 


ACTCCCTCCT 


TTGCTTGTCT 


10860 


TGACTCTGGG 


TCCACACCTG 


ACACCTTTCC 


10920 


TGGGAGGCCG 


CCGTCCCGCG 


CGCGCCTCCT 


10980 


TCTCTTTAGA 
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